Hi Roberto,
I haven’t found time to look at your specification of the likelihood, but I think I can clear up some of the issues around r-hat and Pareto k.
First, the Pareto k diagnostic is not a convergence diagnostic. Neither is cross-validation. More on that below.
Second, the materials regarding r-hat that you link in your second post point out a problem with the traditional r-hat metric, and then they present a new and improved r-hat that solves it. All of the Stan ecosystem (at least the regularly maintained parts) uses this new r-hat diagnostic. With that said, r-hat still isn’t always sensitive to lack of convergence. Practically speaking, however, if you have a model without divergences, without BFMI warnings, with acceptable effective sample sizes, and with low r-hat, then you can feel reasonably confident about convergence, particularly if you’ve run a not-too-small number of chains with inits that are overdispersed relative to the posterior. To feel even more confident, you can run SBC (simulation-based calibration) on your model.
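To make that checklist concrete, here is a rough sketch of the routine checks using ArviZ from Python (the equivalent checks exist in the R posterior package); it assumes your fit has already been loaded into an InferenceData object named `idata`, e.g. via cmdstanpy, and the thresholds are the usual rules of thumb, not hard guarantees:

```python
# Illustrative sketch, not a definitive recipe: assumes a Stan fit converted
# to an ArviZ InferenceData object `idata` (e.g. arviz.from_cmdstanpy(...)).
import arviz as az
import numpy as np

def basic_convergence_checks(idata):
    # Rank-normalized split r-hat (the "new" r-hat); values above ~1.01 are suspect.
    rhat = az.rhat(idata)
    bad_rhat = int((rhat.to_array() > 1.01).sum().item())

    # Bulk and tail effective sample sizes; roughly 100 per chain is a common floor.
    n_chains = idata.posterior.sizes["chain"]
    ess_bulk = az.ess(idata, method="bulk")
    ess_tail = az.ess(idata, method="tail")
    low_ess = int((ess_bulk.to_array() < 100 * n_chains).sum().item()
                  + (ess_tail.to_array() < 100 * n_chains).sum().item())

    # Sampler-level warnings: divergent transitions and low energy BFMI.
    # ArviZ stores the divergence flag as "diverging" in sample_stats for Stan fits.
    n_divergent = int(idata.sample_stats["diverging"].sum().item())
    low_bfmi = int(np.sum(az.bfmi(idata) < 0.3))

    print(f"parameters with r-hat > 1.01: {bad_rhat}")
    print(f"parameters with low bulk/tail ESS: {low_ess}")
    print(f"divergent transitions: {n_divergent}")
    print(f"chains with BFMI < 0.3: {low_bfmi}")
```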
Third, cross-validation is useful for evaluating model adequacy and detecting misspecification, but it has nothing to do with convergence. While poor fit and misspecification sometimes obstruct convergence in practice, they are conceptually orthogonal to whether the MCMC estimators have converged.
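As an example of what cross-validation is actually for, here is a minimal sketch of model comparison and adequacy checking with PSIS-LOO via ArviZ; it assumes two fitted models stored as InferenceData objects `idata_a` and `idata_b` (illustrative names), each with pointwise log-likelihoods in their log_likelihood group:

```python
# Sketch of cross-validation as a model-assessment tool, not a convergence check.
import arviz as az
import numpy as np

loo_a = az.loo(idata_a, pointwise=True)
loo_b = az.loo(idata_b, pointwise=True)

# elpd_loo estimates out-of-sample predictive accuracy; compare() ranks the
# models and reports the elpd difference with a standard error.
print(az.compare({"model_a": idata_a, "model_b": idata_b}))

# Observations with unusually low pointwise elpd often hint at misspecification
# (e.g. outliers the model cannot accommodate) -- a fit question, not a
# convergence question.
worst = np.argsort(loo_a.loo_i.values)[:5]
print("worst-predicted observations:", worst)
```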
Fourth, the Pareto k diagnostic flags cases where the PSIS-LOO approximation to exact leave-one-out cross-validation is failing. In some contexts, it can also be indicative of misspecification. Again, it is never a convergence diagnostic. For more about PSIS-LOO and Pareto k, see the loo package glossary (loo-glossary • loo).
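If it helps, this is roughly what inspecting Pareto k looks like in practice, again using ArviZ and assuming an InferenceData object `idata` with pointwise log-likelihoods (the 0.5/0.7 cutoffs are the commonly cited ones):

```python
# Sketch of checking the Pareto k diagnostic from PSIS-LOO. High k values mean
# the importance-sampling approximation to exact leave-one-out CV is unreliable
# for those observations; they say nothing about MCMC convergence.
import arviz as az
import numpy as np

loo_res = az.loo(idata, pointwise=True)
k = loo_res.pareto_k.values

print(f"observations with k > 0.7: {np.sum(k > 0.7)}")
print(f"observations with 0.5 < k <= 0.7: {np.sum((k > 0.5) & (k <= 0.7))}")

# For the flagged observations, one option is to refit the model without each
# such point (exact LOO for just those cases) or to fall back to K-fold CV.
```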