I’m using the rstan package and noticed that it reports an Rhat value even if I run only one chain. To my understanding, calculating R-hat requires at least two chains, so I wonder how R-hat is being calculated in rstan in this case.
I’m pretty sure that Stan uses split R-hat, so if there are m chains, you get 2m pieces that can be used for computing R-hat. I would always run multiple chains, but if you have just 1 chain, it is still split in two.
Thanks for the information! If I run 1 chain with 2000 iterations, does that mean the 2000 iterations will be split into two sets of 1000 iterations each, or will I get 2000 samples in total?
If you run 1 chain with 1000 warmup and 1000 saved iterations, then the 1000 saved iterations are split into two sets when computing split R-hat, but they are put back together when the simulations are returned from Stan. In general, if you run m chains with n saved iterations each, Stan computes split R-hat by splitting them into 2m chains, each of length n/2, and then puts them back together and returns m chains, each of length n. I guess Stan will also return the warmup iterations if you ask for them, but usually I don’t do anything with them.
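Just to make the splitting concrete, here is a rough base-R sketch of the classic (non-rank-normalized) split R-hat for a single parameter. The `draws` matrix and the helper name are made up for illustration, and current interfaces use a rank-normalized version rather than exactly this:

```r
# Sketch only: basic split R-hat for one parameter, following the
# BDA3 / Reference Manual formulas. `draws` is a hypothetical n x m matrix
# of post-warmup draws (n iterations per chain, m chains).
split_rhat_basic <- function(draws) {
  n <- nrow(draws)
  half <- floor(n / 2)
  # Split each of the m chains in half -> 2m "chains" of length n/2.
  split <- cbind(draws[1:half, , drop = FALSE],
                 draws[(n - half + 1):n, , drop = FALSE])
  n2 <- nrow(split)
  chain_means <- colMeans(split)
  chain_vars  <- apply(split, 2, var)
  B <- n2 * var(chain_means)              # between-chain variance
  W <- mean(chain_vars)                   # within-chain variance
  var_plus <- (n2 - 1) / n2 * W + B / n2  # marginal posterior variance estimate
  sqrt(var_plus / W)
}

# One chain of 2000 saved iterations gets treated as two chains of 1000:
draws <- matrix(rnorm(2000), ncol = 1)
split_rhat_basic(draws)  # close to 1 for this well-mixed fake "chain"
```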
You can get the precise definition we use in both the Stan Reference Manual and in Gelman et al.'s Bayesian Data Analysis (free pdf on the book’s home page). Here’s the relevant section of the Reference Manual:
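Summarizing that section rather than quoting it verbatim (so treat this as my paraphrase): with $m$ split chains, each of length $n$, chain means $\bar{\theta}_{\cdot j}$, within-chain variances $s_j^2$, and grand mean $\bar{\theta}_{\cdot\cdot}$,

$$
B = \frac{n}{m-1}\sum_{j=1}^{m}\left(\bar{\theta}_{\cdot j} - \bar{\theta}_{\cdot\cdot}\right)^2,
\qquad
W = \frac{1}{m}\sum_{j=1}^{m} s_j^2,
$$

$$
\widehat{\mathrm{var}}^{+}(\theta \mid y) = \frac{n-1}{n}\,W + \frac{1}{n}\,B,
\qquad
\widehat{R} = \sqrt{\frac{\widehat{\mathrm{var}}^{+}(\theta \mid y)}{W}}.
$$

In the single-chain case discussed above, the two half-chains play the role of the $m$ chains here.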
I guess that depends on what you call “Stan”. It’s still being used in CmdStan and CmdStanPy, and I only use CmdStanPy these days. I have no idea how far CmdStanR has drifted from mirroring CmdStanPy. [Edit: Oops—just saw @Jonah’s post that says they’re using the convergence monitoring from posterior in cmdstanr.]
Of course, you can drop in a different posterior analysis package like ArviZ, but I really can’t stand ArviZ (specifically, its dependencies on analysis packages and its god-object design [the latter it shares with all the stats packages in R like lm/glm]).
@avehtari took the expedient step of updating the convergence monitoring for some of our interfaces, but not for the core C++ packages. As a result, some of our interfaces remain out of sync. @mitzimorris is now taking the time to build performant implementations in C++ and will be wrapping them in CmdStan so they’ll be available to CmdStanPy. My guess is that CmdStanR will continue to forge its own path and ArviZ will continue to use its own implementations. Nobody, including me, is prioritizing keeping our interfaces in sync.
Thankfully the additions that @mitzimorris mentioned should bring the interfaces more into alignment on diagnostics. It’s not so much that we wanted to forge our own path with CmdStanR (although you’re right that we did), it’s just that we had the improved diagnostics available in R so it seemed like we should go ahead and use them. It wasn’t clear at the time when they would end up making it into CmdStan. It’s great that we’ll have them in stansummary going forward!
Sorry, I just realized my mistake was not about what I call “Stan”, but that I was also thinking about the effective sample size (ESS). BDA3 Section 11.5 describes a version of ESS (which includes the R-hat computation) that is not used in CmdStan.
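For anyone who wants the newer rank-normalized diagnostics directly in R (the ones from the posterior package mentioned above), a quick sketch, assuming a draws matrix `d` with iterations as rows and chains as columns:

```r
library(posterior)

# Hypothetical draws: 1000 iterations x 4 chains for a single parameter
d <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

rhat(d)      # rank-normalized split R-hat
ess_bulk(d)  # bulk effective sample size
ess_tail(d)  # tail effective sample size
```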