So I’ve been recommending simulation-based calibration (SBC) as the go-to method for validating models, and I try to use it as much as I can in my own practice. In the past few days I noticed there is a class of bugs that SBC does not catch: the model ignoring data. Ignoring a parameter in the likelihood may also lead to only very small deviations in the SBC plots. Maybe that is obvious to some, but it wasn’t obvious to me, so I’m sharing the experience.
Remember that the main idea behind SBC is that when you repeatedly simulate parameters and data from a model’s prior and fit the model to the simulated data, the posterior draws pooled over all simulations recover the prior. If your model accidentally ignores all the data, each fit simply returns the prior, so it passes SBC checks very easily. The same holds if your model ignores only part of the data but is otherwise correct - for example, I had an indexing error that made my model ignore the last datapoint for each group of observations and it passed SBC with flying colors.
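To make this concrete, here is a minimal sketch, assuming a toy conjugate model (mu ~ Normal(0, 1), y_i ~ Normal(mu, 1)) rather than the model discussed in this post. Because the posterior is available in closed form, no sampler is needed. The names `sbc_ranks` and `chi2_uniform` are made up for this illustration. The "buggy" fit ignores the data and just returns prior draws, and its SBC ranks come out as uniform as those of the correct fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_obs, n_draws = 1000, 10, 99  # assumed sizes, chosen for illustration


def sbc_ranks(ignore_data):
    """SBC ranks for the toy model mu ~ N(0, 1), y_i ~ N(mu, 1)."""
    ranks = np.empty(n_sims, dtype=int)
    for s in range(n_sims):
        mu_true = rng.normal(0.0, 1.0)             # draw parameter from the prior
        y = rng.normal(mu_true, 1.0, size=n_obs)   # simulate data given the parameter
        if ignore_data:
            # Buggy "fit": the likelihood never sees y, so posterior == prior.
            post_mean, post_sd = 0.0, 1.0
        else:
            # Exact conjugate posterior for a N(0, 1) prior and unit noise.
            post_mean = y.sum() / (n_obs + 1)
            post_sd = np.sqrt(1.0 / (n_obs + 1))
        draws = rng.normal(post_mean, post_sd, size=n_draws)
        ranks[s] = np.sum(draws < mu_true)         # rank of the true value, 0..n_draws
    return ranks


def chi2_uniform(ranks, n_bins=10):
    """Chi-square statistic of the rank histogram against a uniform one."""
    counts, _ = np.histogram(ranks, bins=n_bins, range=(0, n_draws + 1))
    expected = len(ranks) / n_bins
    return np.sum((counts - expected) ** 2 / expected)


stat_ok = chi2_uniform(sbc_ranks(ignore_data=False))
stat_bug = chi2_uniform(sbc_ranks(ignore_data=True))
print("correct model:", stat_ok)
print("ignores data: ", stat_bug)
```

With 10 bins the statistic has 9 degrees of freedom under uniformity, so values near 9 are typical and very large values would signal miscalibration. In this sketch both fits should land in the unremarkable range, which is exactly the problem.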
To a lesser extent, SBC plots can be mostly fine when you ignore a parameter (in my case one term in the linear predictor part of a model). This manifested as a very slight skew in the standard deviation term, but other than that, the SBC plots looked nice.
I was able to notice the “ignore all data” and “ignore one parameter” cases easily because, in addition to the SBC histograms, I always plot a scatter of true value vs. posterior mean/median. This gives a sense of the precision with which the model estimates the parameter, so when the model ignores data or a parameter, it shows up as a lack of correlation in this plot.
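The numbers behind such a scatter can be sketched as follows, again assuming the toy conjugate model mu ~ Normal(0, 1), y_i ~ Normal(mu, 1) and not the model from this post. The correlation between true value and posterior mean is used here as a one-number stand-in for the plot; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims, n_obs = 500, 10  # assumed sizes, chosen for illustration

# One prior draw and one simulated dataset per SBC step.
mu_true = rng.normal(0.0, 1.0, size=n_sims)
y = rng.normal(mu_true[:, None], 1.0, size=(n_sims, n_obs))

# Correct fit: exact conjugate posterior mean for a N(0, 1) prior and unit noise.
post_mean_ok = y.sum(axis=1) / (n_obs + 1)

# Buggy fit that ignores the data: the posterior is the prior, so the
# posterior mean estimated from 100 draws is just noise around 0.
post_mean_bug = rng.normal(0.0, 1.0, size=(n_sims, 100)).mean(axis=1)

# These pairs (mu_true, post_mean_*) are what the scatter plot would show.
corr_ok = np.corrcoef(mu_true, post_mean_ok)[0, 1]
corr_bug = np.corrcoef(mu_true, post_mean_bug)[0, 1]
print("correct model:", corr_ok)
print("ignores data: ", corr_bug)
```

For the correct fit the points fall close to the diagonal and the correlation is high. For the data-ignoring fit the points form a flat band and the correlation is near zero, even though the SBC ranks look fine.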
Here is an SBC plot + scatter from an OK model (sorry for just showing 50 SBC steps, but it takes time to run and I have work to do :-) ):
And here are the same plots when the beta parameter has been left out of the model likelihood:
You’ll notice that the SBC plots look roughly the same but the scatter shows that the posterior median is not influenced by the true value in the second case…
Also, the data-ignoring problems become apparent when doing posterior predictive checks, so yay for PP checks!
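One way a PP check can expose this, as a rough sketch on the same assumed toy model (mu ~ Normal(0, 1), y_i ~ Normal(mu, 1)): fit a single dataset, simulate replicated datasets from the posterior, and compare the replicated sample means with the observed one. The true value 2.5 is an assumption picked so the data sit away from the prior mean. When the data happen to look like typical prior predictive draws, this particular check is much less informative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_rep = 50, 4000  # assumed sizes, chosen for illustration

# A single "real" dataset; mu = 2.5 is an assumed value away from the prior mean.
y = rng.normal(2.5, 1.0, size=n_obs)
t_obs = y.mean()  # test statistic: the sample mean


def ppc_tail_prob(post_mean, post_sd):
    """P(mean(y_rep) >= mean(y)) under the given normal posterior for mu."""
    mu_draws = rng.normal(post_mean, post_sd, size=n_rep)
    y_rep = rng.normal(mu_draws[:, None], 1.0, size=(n_rep, n_obs))
    return np.mean(y_rep.mean(axis=1) >= t_obs)


# Correct fit: exact conjugate posterior.
p_ok = ppc_tail_prob(y.sum() / (n_obs + 1), np.sqrt(1.0 / (n_obs + 1)))

# Buggy fit that ignores the data: posterior == prior.
p_bug = ppc_tail_prob(0.0, 1.0)

print("correct model:", p_ok)
print("ignores data: ", p_bug)
```

A tail probability close to 0 or 1 means the replicated data do not resemble the observed data. The correct fit should give a middling value, while the data-ignoring fit predicts sample means scattered around 0 and puts the observed mean far in the tail.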
Hope that helps somebody :-)