Model Checking/Validating Best Practices

Hi All,

I want to get a discussion going on how Stan users on the forum go about checking and validating their models, and what best practices they follow. I know this can be case-specific and that information can be found in the various user manuals, but I’m still curious.

Checking

At a minimum, a good model should produce posterior estimates with \hat{R} \leq 1.01 and a sufficiently large effective sample size (ESS; roughly 100 per chain or more). The Monte Carlo standard error of the mean (se_mean) should also be small relative to the posterior standard deviation, indicating that the chains have been run long enough. If not, the number of iterations per chain can be increased from the default of 2000 to, say, 2500.
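
For example, with rstan these quantities all show up in the fit summary. This is just a rough sketch; "model.stan" and data_list are placeholders for your own model and data:

```r
library(rstan)

fit <- stan(file = "model.stan", data = data_list,
            chains = 4, iter = 2000)

# The summary matrix reports se_mean, n_eff (ESS), and Rhat for every parameter.
s <- summary(fit)$summary
s[, c("mean", "se_mean", "sd", "n_eff", "Rhat")]

# Flag parameters whose Rhat exceeds 1.01.
rownames(s)[s[, "Rhat"] > 1.01]
```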

Chain mixing can also be checked visually via traceplots of the parameter draws. Well-mixed, overlapping chains should look like “fuzzy caterpillars”.
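
Continuing from the fit above (the parameter names are just illustrative):

```r
# Traceplots of the post-warmup draws; well-mixed chains should overlap.
traceplot(fit, pars = c("mu", "sigma"), inc_warmup = FALSE)
```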

In this 2017 case study, Michael Betancourt walks through a robust statistical workflow using Stan:

https://mc-stan.org/users/documentation/case-studies/rstan_workflow.html

which also covers diagnostics of the HMC sampler itself, such as tree depth saturation, E-BFMI (energy Bayesian fraction of missing information), and divergent transitions.
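
rstan bundles these sampler checks into a few helper functions, e.g.:

```r
# One-stop check for divergences, tree depth saturation, and low E-BFMI.
check_hmc_diagnostics(fit)

# Or individually:
check_divergences(fit)
check_treedepth(fit)
check_energy(fit)
```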

Finally, reparameterizing the model may be needed to fix common issues; for hierarchical models, switching to a non-centered parameterization is the usual fix for divergences.
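
For instance, here is a rough sketch of a non-centered parameterization for an eight-schools-style hierarchical model (the data names and priors are illustrative, not a recipe):

```r
library(rstan)

nc_code <- "
data {
  int<lower=1> J;
  vector[J] y;
  vector<lower=0>[J] sigma;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] theta_raw;                      // standardized group effects
}
transformed parameters {
  vector[J] theta = mu + tau * theta_raw;   // implies theta ~ normal(mu, tau)
}
model {
  mu ~ normal(0, 5);
  tau ~ normal(0, 5);                       // half-normal, since tau >= 0
  theta_raw ~ std_normal();
  y ~ normal(theta, sigma);
}
"

# schools_data is a placeholder list with elements J, y, and sigma.
fit_nc <- stan(model_code = nc_code, data = schools_data, chains = 4, iter = 2000)
```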

Validating

Model validation typically consists of simulating replicated data from the fitted model and comparing it to the observed data via posterior predictive checks. The adequacy of the priors is analogously assessed via prior predictive checks.
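
For example, if the model’s generated quantities block produces a replicated data vector y_rep (the name is just a convention), bayesplot makes the comparison easy:

```r
# Posterior predictive check: overlay densities of replicated and observed data.
library(bayesplot)

y_rep <- extract(fit, pars = "y_rep")$y_rep   # draws x N matrix
ppc_dens_overlay(y = y, yrep = y_rep[1:100, ])

# A prior predictive check works the same way, but with y_rep generated from
# the priors alone (e.g., by running the model with the likelihood removed).
```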

Simulation-based calibration (SBC) can further check whether the fitting algorithm is well calibrated, by repeatedly fitting the model to data simulated from the prior.
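
A rough sketch of the SBC idea for a single parameter mu, assuming "model.stan" encodes mu ~ normal(0, 1) and y ~ normal(mu, 1) (in practice you would thin the draws and use many more simulations):

```r
library(rstan)

model  <- stan_model("model.stan")
n_sims <- 100
ranks  <- integer(n_sims)

for (s in seq_len(n_sims)) {
  mu_true <- rnorm(1, 0, 1)                  # 1. draw mu from its prior
  y_sim   <- rnorm(50, mu_true, 1)           # 2. simulate data given that draw
  fit_s   <- sampling(model, data = list(N = 50, y = y_sim),
                      chains = 4, iter = 2000, refresh = 0)
  mu_draws <- extract(fit_s, pars = "mu")$mu
  ranks[s] <- sum(mu_draws < mu_true)        # 3. rank of the true value
}

# 4. If the computation is well calibrated, the ranks are uniformly distributed.
hist(ranks, breaks = 20)
```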

Question

This is only a summary, but I’m wondering what others think. Care to weigh in?


In addition to @betanalpha’s workflow case study, there’s a later paper on workflow by Gelman et al., “Bayesian Workflow” (https://arxiv.org/abs/2011.01808).

I would not recommend inspecting trace plots by eye. In addition to high ESS, what you want to see is that doubling the length of Markov chains doubles the ESS.
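
That check is easy to script, e.g. (placeholders for the model and data):

```r
# ESS should roughly double when the chains are run twice as long.
fit_1x <- stan("model.stan", data = data_list, chains = 4, iter = 2000)
fit_2x <- stan("model.stan", data = data_list, chains = 4, iter = 4000)

summary(fit_1x)$summary[, "n_eff"]
summary(fit_2x)$summary[, "n_eff"]
```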

I’d break the validation down into validating the algorithm’s calibration on simulated data (SBC), evaluating the prior (prior predictive checks), evaluating the fit to data (posterior predictive checks), and fit to new data (cross-validation). There’s a part of the User’s Guide that goes over how to code all of these in Stan.
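
As a rough sketch (not verbatim from the User’s Guide), a simple normal model whose generated quantities block produces both replicated data for posterior predictive checks and pointwise log-likelihoods for cross-validation might look like:

```r
library(rstan)

code <- "
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 5);
  sigma ~ normal(0, 5);
  y ~ normal(mu, sigma);
}
generated quantities {
  vector[N] y_rep;      // replicated data for posterior predictive checks
  vector[N] log_lik;    // pointwise log-likelihood for cross-validation
  for (n in 1:N) {
    y_rep[n]   = normal_rng(mu, sigma);
    log_lik[n] = normal_lpdf(y[n] | mu, sigma);
  }
}
"

fit <- stan(model_code = code, data = list(N = length(y), y = y))
```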

I think the general approach around here to internal validation strategies (e.g., some form of cross-validation) probably loosely follows the loo package (“Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models”) and its associated papers.
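
For instance, assuming fit is a stanfit object whose generated quantities block stores pointwise log-likelihoods in a variable called log_lik:

```r
# PSIS-LOO cross-validation with the loo package.
library(loo)

log_lik <- extract_log_lik(fit, parameter_name = "log_lik", merge_chains = FALSE)
r_eff   <- relative_eff(exp(log_lik))
loo_fit <- loo(log_lik, r_eff = r_eff)
print(loo_fit)   # elpd_loo, p_loo, and Pareto-k diagnostics
```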