Does CmdStan report outdated convergence diagnostics?

Hi all

Is there a reason CmdStan still reports BDA3 convergence diagnostics?
(To be clear, I’m talking about plain CmdStan itself, not CmdStanR or CmdStanPy. My workflow is, regrettably perhaps, MATLAB-based.)

This is the output I get from stansummary with CmdStan 2.27.0:

Inference for Stan model: example_RL_model
 chains: each with iter=(100,100,100,100); warmup=(0,0,0,0); thin=(1,1,1,1); 400 iterations saved.
Warmup took (1.2, 1.1, 0.98, 1.2) seconds, 4.5 seconds total
Sampling took (2.2, 1.8, 2.2, 2.2) seconds, 8.4 seconds total
                      Mean        MCSE      StdDev          5%         50%         95%       N_Eff     N_Eff/s       R_hat
lp__           -7.1740e+02  6.1392e-01  6.5080e+00 -7.2798e+02 -7.1715e+02 -7.0727e+02  1.1238e+02  1.3436e+01  1.0148e+00
accept_stat__   9.2191e-01  1.2115e-02  9.7309e-02  7.1243e-01  9.6110e-01  9.9792e-01  6.4516e+01  7.7135e+00  1.0277e+00
stepsize__      7.1840e-02  4.2839e-03  6.1513e-03  6.4888e-02  7.2025e-02  8.1537e-02  2.0619e+00  2.4652e-01  3.1216e+14
treedepth__     5.7100e+00  1.8184e-01  4.7059e-01  5.0000e+00  6.0000e+00  6.0000e+00  6.6972e+00  8.0071e-01  1.2042e+00
n_leapfrog__    6.1360e+01  2.0530e+00  2.0054e+01  3.1000e+01  6.3000e+01  9.5000e+01  9.5419e+01  1.1408e+01  1.0365e+00
divergent__     0.0000e+00         nan  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00         nan         nan         nan
energy__        7.4440e+02  7.4908e-01  8.1130e+00  7.3166e+02  7.4450e+02  7.5710e+02  1.1730e+02  1.4024e+01  1.0119e+00
shape           3.2046e+00  8.1131e-02  1.2449e+00  1.5459e+00  3.0225e+00  5.5376e+00  2.3544e+02  2.8149e+01  9.9395e-01
rate            5.4027e-01  1.8179e-02  2.4991e-01  2.3185e-01  4.9105e-01  1.0234e+00  1.8898e+02  2.2594e+01  9.9860e-01
a               1.2192e+00  3.1937e-02  4.2291e-01  6.3073e-01  1.1527e+00  2.0131e+00  1.7535e+02  2.0965e+01  1.0150e+00
b               5.1111e+00  9.6788e-02  1.9649e+00  2.5316e+00  4.9225e+00  8.5504e+00  4.1214e+02  4.9276e+01  9.9924e-01
beta[1]         5.0525e+00  1.5618e-01  3.3383e+00  1.1457e+00  4.2911e+00  1.0455e+01  4.5687e+02  5.4624e+01  1.0006e+00
beta[2]         6.7553e+00  2.1137e-01  3.7318e+00  2.5350e+00  5.9702e+00  1.3600e+01  3.1170e+02  3.7266e+01  9.9781e-01

Since there’s still n_eff, rather than ess_bulk, ess_tail, etc., I guessed that CmdStan is still spitting out BDA3 diagnostics, rather than the new diagnostics as per Vehtari et al., 2020 (arXiv link).

I confirmed this for Rhat with my own implementation (in MATLAB). I match CmdStan’s R_hat value exactly when I use split Rhat (as per BDA3), but never when I use folded split Rhat (as per Vehtari).
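For readers who want to reproduce this comparison outside MATLAB, here is a minimal pure-Python sketch of the two quantities being compared. It is not CmdStan's code: the function names are hypothetical, equal-length chains are assumed, and dropping the middle draw of an odd-length chain is my assumption. `split_rhat` follows the BDA3 definition; `rank_rhat` follows my reading of Vehtari et al. (the larger of the rank-normalized split Rhat and the rank-normalized folded split Rhat).

```python
import math
from statistics import NormalDist, mean, median, variance

def split_chains(chains):
    """Split every chain into a first and second half.
    Assumption: for odd lengths the middle draw is dropped."""
    halves = []
    for c in chains:
        h = len(c) // 2
        halves.append(list(c[:h]))
        halves.append(list(c[len(c) - h:]))
    return halves

def rhat(chains):
    """Potential scale reduction for equal-length chains (BDA3 formula)."""
    n = len(chains[0])
    w = mean([variance(c) for c in chains])          # within-chain variance W
    b_over_n = variance([mean(c) for c in chains])   # between-chain variance B / n
    var_plus = (n - 1) / n * w + b_over_n
    return math.sqrt(var_plus / w)

def split_rhat(chains):
    """BDA3 split Rhat: Rhat computed on the half-chains."""
    return rhat(split_chains(chains))

def rank_normalize(chains):
    """Replace pooled draws by normal scores of their average ranks."""
    flat = [x for c in chains for x in c]
    s = len(flat)
    order = sorted(range(s), key=lambda i: flat[i])
    ranks = [0.0] * s
    i = 0
    while i < s:
        j = i
        while j + 1 < s and flat[order[j + 1]] == flat[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average 1-based rank for ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    nd = NormalDist()
    z = [nd.inv_cdf((r - 0.375) / (s + 0.25)) for r in ranks]
    n = len(chains[0])
    return [z[k * n:(k + 1) * n] for k in range(len(chains))]

def fold(chains):
    """Absolute deviations from the pooled median."""
    med = median([x for c in chains for x in c])
    return [[abs(x - med) for x in c] for c in chains]

def rank_rhat(chains):
    """Max of rank-normalized split Rhat and its folded variant."""
    halves = split_chains(chains)
    bulk = rhat(rank_normalize(halves))
    tail = rhat(rank_normalize(fold(halves)))
    return max(bulk, tail)
```

Calling `split_rhat(chains)` and `rank_rhat(chains)` on the same list of per-chain draws (read from the CmdStan CSV files) shows which of the two the `R_hat` column agrees with.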

(This is a problem for me because I’m trying to check my MATLAB implementation of both the BDA3 ESS and the new ESS formulae, which, to my knowledge, has not been done in MATLAB before. So I’m currently relying on CmdStan’s output to check my work. I will likely make a post about my implementation issues later, but thought this deserved its own thread.)

For reference, I am working with:

  • Ubuntu 20.04
  • MATLAB R2021a
  • MATLABStan & Trinity (but note that both interfaces just trigger external shell commands to run CmdStan & read in CmdStan’s CSV output — no built-in diagnostics exist/are used)
  • CmdStan 2.26.1 & 2.27.0

Looking forward to your replies, thanks everyone


@avehtari

There were disagreements on how to change the CSV, what to compute and display by default, and details of the C++ implementation. See the related Discourse thread and stan-dev/stan issue.


Ah sure, it makes sense to wait on changes to CmdStan until these sorts of cascading implementation issues are settled.

Thank you for these links, Aki, they are both very helpful and great reads!
