I have the following traceplot. The 4 chains look converged to me.
But if I check the Rhat, I get:
             mean se_mean   sd  2.5% 97.5% n_eff Rhat
beta0[1]    -2.24    0.02 0.07 -2.36 -2.11    13 1.14
beta0[2]    -2.75    0.03 0.09 -2.91 -2.56    11 1.15
Has the model converged? Thanks.
I know this isn’t the answer to your question, but wouldn’t it be easier to decide with more than 400 iterations and 200 warmup iterations? Maybe you can post the Rhats and trace plots after a larger number of iterations?
Apart from that, I do not know enough about how Rhat is computed to judge how serious this is. With respect to assumption testing in a frequentist context, I believe graphical/visual inspection is often better than cut-off values, but I am not sure whether that generalizes to Rhat. In fact, the documentation here (R: Convergence and efficiency diagnostics for Markov Chains) says: “We recommend running at least four chains by default and only using the sample if R-hat is less than 1.05.” A value of 1.15 is far beyond this cut-off.
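For instance, something along these lines (a minimal rstan sketch; `model` and `stan_data` are placeholder names, not objects from your code):

```r
library(rstan)

# Re-run with more iterations and a longer warmup (placeholder object names)
fit <- sampling(model, data = stan_data,
                chains = 4, iter = 2000, warmup = 1000, seed = 1)

# Re-check Rhat and n_eff for the parameters of interest
print(fit, pars = "beta0", probs = c(0.025, 0.975))
```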
Traceplots often look fine when you zoom out far enough.
What do the histograms of the samples look like for the different chains? (Also, a traceplot without warmup might work better.)
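Something like this could produce both plots (a sketch assuming a stanfit object named `fit`):

```r
library(rstan)
library(bayesplot)

# Traceplot of the post-warmup draws only
traceplot(fit, pars = "beta0", inc_warmup = FALSE)

# Per-chain histograms to compare the chains' marginal distributions
arr <- as.array(fit)  # iterations x chains x parameters
mcmc_hist_by_chain(arr, pars = c("beta0[1]", "beta0[2]"))
```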
Replying to both:
After running longer, there is a slight improvement in Rhat.
             mean se_mean   sd  2.5% 97.5% n_eff Rhat
beta0[1]    -2.23    0.01 0.07 -2.36 -2.10    69 1.06
beta0[2]    -2.75    0.02 0.09 -2.93 -2.57    18 1.12
I also have the traceplot without warmup.
Finally, here are the histograms of the samples by chain.
I guess I want to know: when fellow researchers encounter this type of situation, how do you decide? I don’t know how cut-and-dried the Rhat < 1.05 rule is. Your input is much appreciated.
Hi,
From visual inspection, I would say that your chains are not mixing well. The within-chain variance needs to be similar to the between-chain variance; Rhat is a measure of how similar these variances are. In my experience, an acceptable Rhat accompanies visually well-mixed traces.
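To make the comparison concrete, here is a rough R sketch of the basic (non-split, non-rank-normalized) Rhat computation; Stan’s current diagnostic additionally splits chains and rank-normalizes the draws:

```r
# `draws` is an iterations x chains matrix for a single parameter,
# e.g. as.array(fit)[, , "beta0[1]"] for a stanfit object `fit`.
basic_rhat <- function(draws) {
  n <- nrow(draws)                      # post-warmup draws per chain
  chain_means <- colMeans(draws)
  W <- mean(apply(draws, 2, var))       # mean within-chain variance
  B <- n * var(chain_means)             # between-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_plus / W)                    # close to 1 when W and B agree
}
```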
That said, I am frequently at a loss as to how to “repair” a model in which the chains don’t mix well.
Good luck
If I’m not wrong, there is also some autocorrelation in your draws. What are ess_bulk and ess_tail (and what is ndraws)?
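These are easy to check with the posterior and bayesplot packages (a sketch assuming a stanfit object `fit`):

```r
library(posterior)
library(bayesplot)

arr <- as.array(fit)  # iterations x chains x parameters

# Rank-normalized Rhat plus bulk and tail effective sample sizes
summarise_draws(as_draws_array(arr), "rhat", "ess_bulk", "ess_tail")

# Autocorrelation by chain for the slow-mixing parameters
mcmc_acf(arr, pars = c("beta0[1]", "beta0[2]"))
```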
Thank you, everyone, for your replies. I used a less diffuse prior and the chains mix well now.
And this paper suggests an even stricter < 1.01 rule! All these suggestions are ad hoc starting points. Rhat and ESS (previously reported as n_eff) are useful because they are scale-free; that is, when checking them for many parameters, you don’t need to compare them to the standard deviation of the marginal posterior of that parameter or to domain knowledge.
If you don’t like the arbitrariness of the suggested Rhat and ESS thresholds, you can always, in the end, look at the Monte Carlo standard error (MCSE) for the quantities of interest and use domain knowledge to assess whether the accuracy is sufficient. Stan and ArviZ use Rhat and ESS to compute MCSE, so you get the benefit of the multi-chain diagnostic in MCSE as well, although MCSE estimates can be somewhat overoptimistic (say, a factor of two too small) if ESS (n_eff) is small.
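For example, with the posterior R package (a sketch; `fit` is a placeholder for the fitted object):

```r
library(posterior)

draws <- as_draws_array(as.array(fit))

# MCSEs of the posterior mean and sd alongside ESS, so the accuracy of the
# estimates can be judged against domain knowledge
summarise_draws(draws, "mean", "mcse_mean", "sd", "mcse_sd",
                "ess_bulk", "ess_tail")
```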
Great!
EDIT: changed the paper link to point to doi.org
@avehtari, your link “And this paper …” is dead. Can you update it, please?
Thank you! And here is the DOI in case their site changes again: 10.1214/20-BA1221
The link I added points to the DOI https://doi.org/10.1214/20-BA1221, not directly to the BA site.