In multilevel models, it is typical to model the coefficients for particular groups (e.g., effects of interventions in particular schools) as drawn from a normal population distribution.
Within the brms environment (or more generally within RStan), what options are available for evaluating the adequacy of this modeling assumption? Are there specific kinds of graphical predictive checks that are particularly useful for this question?
Any advice is greatly appreciated! Thanks.
You could extract the coefficients with coef() and then plot them to see how closely they resemble a normal distribution.
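As a rough sketch of that idea, something like the following could work. The data are simulated and the names (d, school, y) are made up for illustration; only brm() and coef() are actual brms functions. One caveat: the posterior means are themselves shrunk under the normal assumption, so the plot can look more normal than the true effects are.

```r
library(brms)

# Hypothetical example data: 30 schools with 20 students each
set.seed(1)
n_school <- 30
d <- data.frame(school = factor(rep(seq_len(n_school), each = 20)))
u <- rnorm(n_school, 0, 1)  # simulated school-level deviations
d$y <- 2 + u[as.integer(d$school)] + rnorm(nrow(d), 0, 2)

# Varying-intercept model with the default normal population distribution
fit <- brm(y ~ 1 + (1 | school), data = d,
           chains = 2, iter = 1000, refresh = 0)

# coef() returns one 3-d array per grouping factor:
# groups x summary statistics x coefficients
school_int <- coef(fit)$school[, "Estimate", "Intercept"]

# Normal Q-Q plot of the posterior mean school intercepts
qqnorm(school_int, main = "Posterior mean school intercepts")
qqline(school_int)
```

Strong curvature or clear clusters in the Q-Q plot would be a hint to look closer; a straight line is weaker evidence, for the reason given above.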
But there is more to it. Even if the normal distribution does not perfectly fit the distribution of the varying effects, it still does its job, which is to provide shrinkage so that we are less excited about extreme patterns in the data. I believe there are some simulation studies out there (using frequentist models) showing that mean coefficients and other hyperparameters are unlikely to be misrepresented even if the distribution of the varying effects is very different from normal.
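If you want to see how much the normal assumption matters for your own data, one option is to refit with a heavier-tailed population distribution and compare the estimates. The sketch below uses simulated data with made-up names (d, school, y); gr(..., dist = "student") is the brms syntax for Student-t distributed group-level effects. Treat it as an illustration under those assumptions, not a recipe.

```r
library(brms)

# Hypothetical example data: 30 schools with 20 students each
set.seed(2)
n_school <- 30
d <- data.frame(school = factor(rep(seq_len(n_school), each = 20)))
d$y <- 2 + rnorm(n_school)[as.integer(d$school)] + rnorm(nrow(d), 0, 2)

# Default: normally distributed school effects
fit_normal <- brm(y ~ 1 + (1 | school), data = d,
                  chains = 2, iter = 1000, refresh = 0)

# Alternative: Student-t distributed school effects
fit_student <- brm(y ~ 1 + (1 | gr(school, dist = "student")), data = d,
                   chains = 2, iter = 1000, refresh = 0)

# Posterior mean school intercepts under both assumptions
est <- cbind(
  normal  = coef(fit_normal)$school[, "Estimate", "Intercept"],
  student = coef(fit_student)$school[, "Estimate", "Intercept"]
)

# Points close to the identity line indicate low sensitivity
plot(est[, "normal"], est[, "student"],
     xlab = "Normal school effects", ylab = "Student-t school effects")
abline(0, 1)
```

It is also worth comparing the population-level intercept and the group-level standard deviation between the two fits, since those are the hyperparameters mentioned above.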
Interesting, thanks Paul!
I know that in BDA section 17.4, there is a sensitivity analysis reported where the model is re-fit with a t-distribution for the school effects, yielding essentially unchanged school-specific estimates.
Presumably, if the chosen distribution were extremely poor (e.g., the model uses a normal distribution to describe the heterogeneity, but the true distribution is bimodal) you would get sensitivity of the estimates to the prior, and you would also see other problems with the model, right?
For example, using the “stat_grouped” type within pp_check, I am guessing that the model-simulated distributions of each person’s mean yrep would poorly match the actual observed y means. Is that correct? Thanks so much!
It’s possible, perhaps likely, that you will see problems, but that will surely depend on your data and model at hand.
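For reference, the grouped check discussed above might be run roughly as follows. The data are simulated and the names (d, person, y) are hypothetical; pp_check() with type = "stat_grouped" passes stat and group on to bayesplot, and the ndraws argument assumes a reasonably recent brms version. Note that, by default, the replications condition on the estimated person-level effects, so this check may be fairly insensitive to the shape of the population distribution.

```r
library(brms)

# Hypothetical example data: 12 persons with 15 observations each
set.seed(3)
n_person <- 12
d <- data.frame(person = factor(rep(seq_len(n_person), each = 15)))
d$y <- rnorm(n_person)[as.integer(d$person)] + rnorm(nrow(d))

fit <- brm(y ~ 1 + (1 | person), data = d,
           chains = 2, iter = 1000, refresh = 0)

# One panel per person: distribution of mean(yrep) across posterior draws,
# with the observed mean of y for that person overlaid
p <- pp_check(fit, type = "stat_grouped", stat = "mean",
              group = "person", ndraws = 200)
print(p)
```

Other statistics, such as stat = "sd" or a custom function, can be swapped in the same way if the group means alone look fine.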