Hi, would anyone know how Stan calculates the Rhat value using only one chain?
I remember reading somewhere that for effective sample size calculations a single chain is split in half. It’s probably the same for R-hat.
Perhaps you are remembering from the Stan User Manual in the section entitled “Initialization and Convergence Monitoring”.
Wow! This is news to me. Thank you very much. I found something about this on page 479 of the Stan version 2.12.0 manual.
Still, I would be interested in a published reference on the subject.
Don’t use Stan version 2.12.0 because the sampler was (slightly) broken, although the theory of split R-hat does not change in later versions. I think the only published explanation of it is in the BDA3 textbook, but Andrew might be working on a paper about it.
It’s in Gelman et al.'s Bayesian Data Analysis. And Andrew’s trying to put together a paper on split R-hat and ESS calculations that discount for non-convergence.
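For anyone who wants to see the idea concretely, here is a minimal sketch of split R-hat for one scalar parameter, following the formulas as I understand them from BDA3. The function name `split_rhat` and the choice to drop the middle draw of an odd-length chain are my own assumptions, and this is not necessarily what any given Stan version computes internally.

```python
import math
import statistics


def split_rhat(chains):
    """Split R-hat for one scalar parameter (illustrative sketch only).

    chains: list of equal-length lists of post-warmup draws.
    A single chain works because each chain is split into two halves.
    """
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.append(chain[:half])               # first half
        halves.append(chain[len(chain) - half:])  # second half (middle draw dropped if odd)

    n = len(halves[0])  # draws per half-chain
    means = [statistics.fmean(h) for h in halves]

    # W: average within-half-chain sample variance
    w = statistics.fmean([statistics.variance(h) for h in halves])
    # B: between-half-chain variance, scaled by n
    b = n * statistics.variance(means)

    # Pooled estimate of the marginal posterior variance
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(var_plus / w)
```

With a single chain, `split_rhat([draws])` compares the first half against the second half, so a chain that is still drifting gives a value well above 1, while a stationary chain gives a value close to 1.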
Thank you, Prof. Carpenter.