Is the effective sample size estimator using the variogram in BDA3 always valid?

The estimator does not seem to be guaranteed to be positive definite.


(However, I do expect the estimator to be positive definite most of the time, because of the practice of cutting off the sum after running into a negative autocovariance.)

What are the advantages and disadvantages to using this estimator rather than the sample autocovariance?



I don’t know about the advantages, but based on my experiments it is less accurate than what is used in Stan. See “16.4 Effective sample size” in the Stan Reference Manual, Section 3.2 of “Rank-Normalization, Folding, and Localization: An Improved R̂ for Assessing Convergence of MCMC (with Discussion)”, and the estimator comparison in “Comparison of MCMC effective sample size estimators”. I don’t know of any package that uses the estimator described in BDA3.

Years ago, older Stan versions truncated the sum at the first negative autocovariance, but that was a mistake: it made the estimator miss antithetic chains that are superefficient for the posterior mean. BDA3 correctly describes that we first compute the sums of consecutive even- and odd-lag autocorrelations and truncate at the first such sum that is negative. The difference between a negative autocorrelation at a single lag and a negative sum of an even-lag and odd-lag pair is important.
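To make the difference concrete, here is a minimal sketch of the two truncation rules applied to the exact autocorrelations of an antithetic AR(1) process with coefficient -0.5. The function names are made up for illustration and this is not Stan's implementation; it only shows why cutting at the first negative single-lag term hides superefficiency.

```python
def rel_eff_single_lag_cut(rho):
    """Relative efficiency ESS/N when the sum stops at the first negative lag."""
    s = 0.0
    for t in range(1, len(rho)):
        if rho[t] < 0:
            break
        s += rho[t]
    return 1.0 / (1.0 + 2.0 * s)


def rel_eff_pair_cut(rho):
    """Relative efficiency ESS/N when pairs rho[2k] + rho[2k+1] are summed
    and the sum stops at the first negative pair."""
    s = 0.0
    k = 0
    while 2 * k + 1 < len(rho):
        pair = rho[2 * k] + rho[2 * k + 1]
        if pair < 0:
            break
        s += pair
        k += 1
    # rho[0] = 1 is inside the first pair, so tau = 1 + 2 * sum_{t>=1} rho_t
    # equals -1 + 2 * (sum of pairs).
    tau = -1.0 + 2.0 * s
    return 1.0 / tau


# Exact autocorrelations of an AR(1) process with coefficient -0.5
# (assumed example; the true relative efficiency is (1 + 0.5) / (1 - 0.5) = 3).
phi = -0.5
rho = [phi ** t for t in range(50)]

single = rel_eff_single_lag_cut(rho)
paired = rel_eff_pair_cut(rho)
print(single, paired)
```

With the single-lag rule the sum is cut before any term is added, so the chain looks no better than independent draws. The paired rule keeps going because every even/odd pair is positive here. With noisy sample autocorrelations the paired sum can in principle give a non-positive tau, which a real implementation would have to guard against.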


Funnily enough, Turing’s package MCMCChains includes it as an option!

That being said, was the improvement caused by using the sample autocovariance instead of the variogram, or by the other changes (mainly rank-normalization)? I don’t see the variogram mentioned in the comparisons.

Autocovariance was used before BDA3, so I would say that the variogram was a poorly tested sidestep. The comparison is not just autocovariance vs. variogram, but multichain autocovariance vs. multichain variogram. Multichain autocovariance is better than multichain variogram, and it was already better years ago, before rank-normalization. Rank-normalization was not included in the experiments in “Comparison of MCMC effective sample size estimators” because it is used only for the new robust generic Rhat and Bulk-ESS, whereas computing MCSE requires the expectand-specific ESS. The variogram was not included in those experiments because I assumed that no one was using it. Based on what you say, it’s not the default method in MCMCChains, which is good. Maybe you can run some comparisons?
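As a starting point for such a comparison, here is a rough sketch of the two multichain autocorrelation estimates on simulated AR(1) chains. It assumes the BDA3 variogram form rho_t = 1 - V_t / (2 var+) and a multichain autocovariance form rho_t = 1 - (W - mean of per-chain autocovariances at lag t) / var+. All function names, the seed, and the chain settings are made up for illustration; this is neither Stan's nor MCMCChains' code.

```python
import random


def simulate_ar1(n, phi, rng):
    """One AR(1) chain started from its stationary distribution."""
    x = [rng.gauss(0.0, (1.0 / (1.0 - phi * phi)) ** 0.5)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def var_plus_and_w(chains):
    """Return (var+, W): the pooled variance estimate and within-chain variance."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    s2 = [sum((v - mu) ** 2 for v in c) / (n - 1) for c, mu in zip(chains, means)]
    w = sum(s2) / m
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    return (n - 1) / n * w + b / n, w


def rho_variogram(chains, t):
    """Variogram-based estimate: 1 - V_t / (2 var+)."""
    m, n = len(chains), len(chains[0])
    var_plus, _ = var_plus_and_w(chains)
    v_t = sum((c[i] - c[i - t]) ** 2 for c in chains for i in range(t, n))
    v_t /= m * (n - t)
    return 1.0 - v_t / (2.0 * var_plus)


def rho_autocov(chains, t):
    """Multichain autocovariance estimate: 1 - (W - mean acov_t) / var+."""
    m, n = len(chains), len(chains[0])
    var_plus, w = var_plus_and_w(chains)
    acov = []
    for c in chains:
        mu = sum(c) / n
        acov.append(sum((c[i] - mu) * (c[i + t] - mu) for i in range(n - t)) / n)
    return 1.0 - (w - sum(acov) / m) / var_plus


rng = random.Random(1)  # arbitrary seed
chains = [simulate_ar1(1000, 0.7, rng) for _ in range(4)]
for t in range(4):
    print(t, round(rho_variogram(chains, t), 3), round(rho_autocov(chains, t), 3))
```

For well-mixed chains like these the two estimates should be close at short lags; a meaningful comparison would repeat this over many replications, longer lags, and harder targets, and then compare the resulting ESS estimates against a known reference.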
