Stan's ESS (and monotone autocorrelations)

I am trying to code up Stan’s effective sample size estimator as defined in Sections 16.4 of the reference manual here and basically wondered whether I’d understood the recipe correctly. Does the order of things go like:

  • take only post warm-up samples
  • split chains into two (so M = 2 * number of chains run)
  • calculate W, var_plus, s_m_sq and rho_t,m
  • use the above to calculate rho_t
  • calculate the series P_t = rho_2t + rho_2t+1 for t = 1,…,m, where m is the largest integer such that P_t > 0 for all t <= m
  • create monotone version of P_t (I call it P_t_mon) by setting P_t_mon = P_t, if P_t-P_t-1<0; or P_t_mon = P_t-1_mon if otherwise (i.e. make it equal to the min of the current state of P_t_mon)
  • calculate tau = -1 + 2 * sum(P_t)
  • calculate ess = N * M / tau

Is this right? If I follow the above recipe with outputs from Stan I get values a bit different from the ones that are in the fit summary, so suspect I must be being dense somewhere (apologies in advance)…Ben

Not sure, but I am sure it is calculated like


on split chains.

See also [1903.08008] Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC.

Which interface and which version? After I know that information, I can check that the reference manual matches the code (CmdStan, RStan, and PyStan, should have the same by default, while monitor in RStan and CmdStanR using posterior package use the rank normalized version).

@avehtari and @bgoodri thank you. I am using pystan 2.19.1.1 (I don’t know how to check Stan’s version within pystan? I know rstan but, for a particular project am using pystan.)

In any case, it sounds like I should be using the rank normalised Rhat after reading the paper @avehtari suggests. I found your repo here: https://github.com/avehtari/rhat_ess/blob/master/code/monitornew.R Is this the current ground truth that I should be working off for these functions?

PyStan version follows Stan version.

import pystan
print(pystan.__version__)

There is rank normalized rhat in ArviZ package.

import arviz as az
summary = az.summary(fit) # pandas DataFrame
print(summary)

edit. https://github.com/arviz-devs/arviz/blob/master/arviz/stats/diagnostics.py

print(az.rhat(fit))
1 Like

If you prefer Python, see @ahartikainen’s reply. Please report if those help, or if you think the documentation could be improved.