Potential Scale Reduction and chains=1


Hi, would anyone know how Stan calculates the Rhat value using only one chain?


I remember reading somewhere that for effective sample size calculations each chain is split in half, so a single chain is treated as two. It’s probably the same for R-hat.
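
To make the idea concrete, here is a rough sketch of split R-hat as I understand it from the BDA3 formulation: each chain is cut in half, the halves are treated as separate chains, and the between-half variance is compared with the within-half variance. The function name `split_rhat` and the example data are made up for illustration, and Stan's actual implementation may differ in its details.

```python
import math
import random
from statistics import mean, variance


def split_rhat(chains):
    """Split R-hat for one parameter (illustrative sketch, not Stan's code).

    `chains` is a list of equal-length lists of draws. Assumes each chain
    has at least 4 draws so that every half has a sample variance.
    """
    # Cut every chain into a first and second half; when the length is odd,
    # the middle draw is dropped so both halves have the same length.
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.append(chain[:half])
        halves.append(chain[len(chain) - half:])

    n = len(halves[0])                       # draws per half
    half_means = [mean(h) for h in halves]
    between = n * variance(half_means)       # B: between-half variance
    within = mean([variance(h) for h in halves])  # W: mean within-half variance

    # Pooled variance estimate, then the ratio that should approach 1.
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


random.seed(1)
# One chain that looks stationary, and one that drifts upward over time.
stationary = [random.gauss(0.0, 1.0) for _ in range(1000)]
drifting = [i / 1000 + random.gauss(0.0, 0.1) for i in range(1000)]

print(split_rhat([stationary]))  # expected to be close to 1
print(split_rhat([drifting]))    # expected to be well above 1
```

So even with chains=1 there are two half-chains to compare, which is why a value can be reported at all. A single drifting chain gets flagged because its two halves disagree.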


Perhaps you are remembering it from the section of the Stan User Manual entitled “Initialization and Convergence Monitoring”.


Wow! This is news to me. Thank you very much. I found something about this on page 479 of the Stan version 2.12.0 manual.


However, a published reference on the subject would be helpful.


Don’t use Stan version 2.12.0 because the sampler was (slightly) broken, although the theory of split R-hat does not change in later versions. I think the only published explanation of it is in the BDA3 textbook, but Andrew might be working on a paper about it.


It’s in Gelman et al.'s Bayesian Data Analysis. And Andrew’s trying to put together a paper on split R-hat and ESS calculations that discount for non-convergence.


Thank you, Prof. Carpenter.