R hat with only one chain

Hi all,

I’m using the rstan package and noticed that it reports Rhat even if I run only one chain. To my understanding, calculating R-hat requires at least two chains. How is R-hat calculated in rstan in this case?

Thank you!

I’m pretty sure that Stan uses split-R-hat, so if there are m chains, you’ll get 2m pieces which can be used for R-hat. I will always run multiple chains, but if you have just 1 chain, it is still split in two.


Thanks for the information! If I run 1 chain with 2000 iterations, does that mean the 2000 iterations will be split into two sets of 1000 iterations each, or will I get 2000 samples in total?

If you run 1 chain with 1000 warmup and 1000 saved iterations, then the 1000 saved iterations will be split into two sets of 500 when computing split R-hat. But they are put back together when the simulations are sent out of Stan. In general, if you run m chains for n saved iterations each, Stan will compute split R-hat by splitting them into 2m chains, each of length n/2, and then Stan will put them back together and return m chains, each of length n. I guess that Stan will also return the warmup iterations if you ask for them, but usually I don’t do anything with them.
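To make the splitting concrete, here is a minimal Python sketch of the classic (non-rank-normalized) split R-hat, following the between/within variance formula in the Stan Reference Manual. It is not Stan's actual code: the function names `split_chains` and `split_rhat` are made up for illustration, and dropping the middle draw of an odd-length chain is an assumption.

```python
import random
import statistics


def split_chains(chains):
    """Split each of m chains into a first and second half (2m pieces)."""
    pieces = []
    for chain in chains:
        half = len(chain) // 2
        pieces.append(chain[:half])
        # Assumption: for an odd-length chain the middle draw is dropped.
        pieces.append(chain[len(chain) - half:])
    return pieces


def split_rhat(chains):
    """Classic split R-hat computed from a list of equal-length chains."""
    pieces = split_chains(chains)
    n = len(pieces[0])  # draws per half-chain
    means = [statistics.fmean(p) for p in pieces]
    # W: average of the within-piece sample variances.
    within = statistics.fmean(statistics.variance(p) for p in pieces)
    # B: n times the sample variance of the piece means.
    between = n * statistics.variance(means)
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


random.seed(1)
# One "chain" of 1000 independent draws, plus a copy with a slow upward drift.
one_chain = [random.gauss(0.0, 1.0) for _ in range(1000)]
drifting = [x + 0.005 * i for i, x in enumerate(one_chain)]

print(len(split_chains([one_chain])))  # number of pieces from a single chain
print(split_rhat([one_chain]))         # stationary chain
print(split_rhat([drifting]))          # drifting chain
```

With a single chain, the two halves play the role of two chains, so the statistic is still defined. A chain whose first and second halves disagree, like the drifting one here, gets a value well above 1, whereas the stationary one stays close to 1.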

Yes, Andrew’s right: Stan uses split-$\widehat{R}$.

You can get the precise definition we use in both the Stan Reference Manual and in Gelman et al.'s Bayesian Data Analysis (free pdf on the book’s home page). Here’s the relevant section of the Reference Manual:

https://mc-stan.org/docs/reference-manual/analysis.html

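The more elaborate version we use now in RStan and ArviZ is based on the paper Rank-Normalization, Folding, and Localization: An Improved $\widehat{R}$ for Assessing Convergence of MCMC.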

This will be coming soon to CmdStan and hence CmdStanPy and CmdStanR.

It’s actually already in CmdStanR because we use the posterior package to compute diagnostics. So fit$summary() will give you the new Rhat.

To be more precise, Bayesian Data Analysis does not describe the version that has been used for years in Stan. The paper Rank-Normalization, Folding, and Localization: An Improved $\widehat{R}$ for Assessing Convergence of MCMC (with Discussion) describes, in addition to the rank-normalized version, the R-hat version that has been in use for years in Stan, and it notes the differences from the BDA3 version, which is not really used anywhere.
