Residual diagnostics in MCMC-based multilevel regression models

I’ve recently started fitting multilevel regression models in the Bayesian framework, using an MCMC algorithm (brms in R, to be specific).

I believe I understand how to diagnose convergence of the estimation process (trace plots, Geweke plots, autocorrelation, posterior distributions…).

One of the things that strikes me in the Bayesian framework is that much effort seems to be devoted to those diagnostics, whereas very little appears to be done in terms of checking the residuals of the fitted model.

Long story short: I will probably have to present my model to a “classic” econometrician, who will expect me to discuss the residuals. Of course, there is the problem of defining “residuals” in Bayesian regression. So should I simply calculate the fitted model values (the Estimate column from the fitted(brm) function), replicate the multilevel model as an LM, and analyze the differences as typical residuals? Or should I focus on posterior predictive checks / LOO validation?

Hi, to start, I recommend taking a look at Chapters 6 and 7 of BDA3. Chapter 6 discusses model checking, and Chapter 7 discusses predictive model evaluation.


We recommend our revised R-hat metrics and computing effective sample size. It’s hard to learn much from squinting at autocorrelation plots or traceplots.
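To make the R-hat idea concrete, here is a minimal sketch in plain Python. It implements the basic split-R-hat described in BDA3, not the revised rank-normalized version mentioned above, and the function name `split_rhat` and the simulated chains are made up for illustration. In practice you would use the diagnostics your software already reports.

```python
import random
import statistics


def split_rhat(chains):
    """Basic split-R-hat for a list of equal-length chains (lists of floats)."""
    # Split each chain in half so that drift within a chain
    # shows up as disagreement between half-chains.
    half = len(chains[0]) // 2
    halves = []
    for chain in chains:
        halves.append(chain[:half])
        halves.append(chain[half:2 * half])

    n = half
    means = [statistics.fmean(h) for h in halves]
    # W: average within-half-chain variance.
    within = statistics.fmean([statistics.variance(h) for h in halves])
    # B: between-half-chain variance, scaled by the half-chain length.
    between = n * statistics.variance(means)
    # Pooled estimate of the marginal posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


rng = random.Random(1)

# Four chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Four chains where one is stuck around a different value.
stuck = [[rng.gauss(mu, 1) for _ in range(1000)] for mu in (0, 0, 0, 3)]

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(f"well-mixed chains: R-hat = {rhat_mixed:.3f}")
print(f"one stuck chain:   R-hat = {rhat_stuck:.3f}")
```

Values close to 1 indicate that the half-chains agree with each other; the chain set with one stuck chain gives a value well above 1, which is much easier to read off than a traceplot.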

It depends on the type of Bayesian. The hardcore subjectivists just believe you put down a subjective prior and your posterior is what it is. So you only have to test that you computed it correctly, not that it makes sense, because you’ve assumed it makes sense.

With Stan, we strongly recommend checking not just the residuals but also the fit to the observed data and to held-out data, using posterior predictive checks and cross-validation, respectively. (“BDA3” is Gelman et al.'s book Bayesian Data Analysis, which has chapters on these topics.)
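As a rough illustration of a posterior predictive check and of residuals computed from posterior draws, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the toy data, the model (y_i ~ Normal(mu, 1) with a flat prior, so the posterior for mu is available in closed form), and the helper name `ppc_pvalue`. With a real fit, the draws of mu would come from the sampler. Cross-validation is not covered here.

```python
import random
import statistics

rng = random.Random(42)

# Toy data: 49 well-behaved points plus one outlier the model cannot explain.
y = [rng.gauss(0, 1) for _ in range(49)] + [8.0]
n = len(y)

# Assumed model: y_i ~ Normal(mu, 1) with a flat prior on mu,
# so the posterior is mu | y ~ Normal(mean(y), 1/n).
y_bar = statistics.fmean(y)
mu_draws = [rng.gauss(y_bar, (1 / n) ** 0.5) for _ in range(2000)]


def ppc_pvalue(y, mu_draws, stat):
    """Share of replicated data sets whose statistic is at least the observed one."""
    observed = stat(y)
    hits = 0
    for mu in mu_draws:
        # One replicated data set per posterior draw.
        y_rep = [rng.gauss(mu, 1) for _ in y]
        hits += stat(y_rep) >= observed
    return hits / len(mu_draws)


# The mean is fit directly by the model, so this check is uninformative.
p_mean = ppc_pvalue(y, mu_draws, statistics.fmean)
# The maximum is sensitive to the outlier, so this check exposes the misfit.
p_max = ppc_pvalue(y, mu_draws, max)

# Residuals averaged over posterior draws: y_i minus the posterior mean of mu.
post_mean_mu = statistics.fmean(mu_draws)
mean_residuals = [yi - post_mean_mu for yi in y]

print(f"posterior predictive p-value, mean: {p_mean:.3f}")
print(f"posterior predictive p-value, max:  {p_max:.3f}")
print(f"largest mean residual: {max(mean_residuals):.2f}")
```

The p-value for the mean sits near 0.5 while the one for the maximum is essentially zero, which shows why the choice of test statistic matters. The residuals flag the same observation, and they can be shown to a classically trained audience alongside the predictive checks.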

I’m also most of the way done with adding chapters on all this to our user’s guide.
