Marginal and Conditional R2 for Stan's GLMMs

Nakagawa (2017) suggests a measure of “absolute” fit that 1) stays relatively close to what a traditional R2 provides, 2) separates the “variance explained” by the fixed effects from that explained by the random effects, and 3) generalizes to non-linear models.
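For orientation, the two quantities are commonly written as below for a Gaussian mixed model with random intercepts. The symbols are notation introduced here for illustration, not taken from the post: \sigma^2_f is the variance of the fixed-effect part of the linear predictor, \sigma^2_l is the variance of the l-th random effect, and \sigma^2_\varepsilon is the residual variance.

```latex
R^2_{\text{marginal}} =
  \frac{\sigma^2_f}{\sigma^2_f + \sum_l \sigma^2_l + \sigma^2_\varepsilon},
\qquad
R^2_{\text{conditional}} =
  \frac{\sigma^2_f + \sum_l \sigma^2_l}{\sigma^2_f + \sum_l \sigma^2_l + \sigma^2_\varepsilon}
```

The marginal version counts only the fixed effects as “explained”, while the conditional version counts fixed and random effects together. For non-Gaussian models, \sigma^2_\varepsilon is replaced by a distribution-specific variance on the link scale, which is what makes the measure extend beyond linear mixed models.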

This method appears to be implemented in the MuMIn package for lme4 models.
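For reference, a minimal sketch of the lme4 workflow, assuming the lme4 and MuMIn packages are installed. The sleepstudy data and the model formula are illustrative choices, not part of the original question.

```r
library(lme4)
library(MuMIn)

# Illustrative model (assumption): reaction time by days of sleep
# deprivation, with a random intercept per subject
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Returns a matrix with columns R2m (marginal) and R2c (conditional)
r2 <- r.squaredGLMM(fit)
print(r2)
```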

Is there any plan or existing attempt to port this method to rstanarm’s models, or does that seem too complicated? Or is there a reason not to use such measures?

See Rstanarm: extracting variance components

Thanks for the link. Should I understand that under the Bayesian framework, I can simply fit a “null” model with the random effects only, get its R2, and compare it to the R2 of the full model (including fixed and random effects)?

Well, it is not usually advisable to fit a model like stan_glmer(y ~ (… | group)) because that forces the common coefficients to be zero. Even if you were going to do something like that, this variance component / R^2 stuff is not the right way to compare models. We have ELPD (in loo), we have stacking weights (in loo), we have posterior probabilities for models (in bridgesampling). Any of those would be better for model comparison than things that are not intended for model comparison.
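A rough sketch of what the loo-based comparison could look like, assuming rstanarm and loo are installed. The two candidate models, the sleepstudy data, and the short chains are illustrative assumptions chosen to keep the example quick; a real analysis would use more iterations. The bridgesampling route is not shown here.

```r
library(rstanarm)
library(loo)

data("sleepstudy", package = "lme4")

# Two illustrative candidates (assumption): random intercepts only
# vs. random intercepts and slopes; both keep the common coefficients
fit_int <- stan_lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy,
                     chains = 2, iter = 1000, seed = 1, refresh = 0)
fit_slope <- stan_lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy,
                       chains = 2, iter = 1000, seed = 1, refresh = 0)

# Approximate leave-one-out cross-validation for each model
loo_int <- loo(fit_int)
loo_slope <- loo(fit_slope)

# Difference in expected log predictive density (ELPD) between models
cmp <- loo_compare(loo_int, loo_slope)
print(cmp)

# Model weights (stacking is the default method)
wts <- loo_model_weights(list(int = loo_int, slope = loo_slope))
print(wts)
```

loo may warn about high Pareto k values with short chains; those diagnostics are worth checking before trusting the comparison.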

Ok, I see. Again, the goal was more to get an absolute index of the “explanatory” power of the model and of its fixed effects, in a form that a frequentist reviewer or supervisor would understand. But as you said, it might be worth taking the time to explain the core differences of the Bayesian framework rather than trying to mimic the frequentist approach. Thanks.

I understand the point about model comparison, but is there any reasonable way to measure how much of the explained variance is due to the random effects and how much is due to the fixed effects? Would comparing bayes_R2(model, re.form = NULL) vs bayes_R2(model, re.form = NA) be a reasonable thing to do?

bayes_R2(model, re.form = NULL) vs bayes_R2(model, re.form = NA) is fine, but it shouldn’t be used to “test” a null hypothesis that the group-specific intercept and coefficients are irrelevant.
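A minimal sketch of that comparison, assuming rstanarm is installed. The model, the data, and the short chains are illustrative assumptions. The two results play the roles of a “conditional” and a “marginal” R2, but they are not guaranteed to match the Nakagawa definitions exactly.

```r
library(rstanarm)

data("sleepstudy", package = "lme4")

# Illustrative model (assumption): random intercepts and slopes per subject
fit <- stan_lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy,
                 chains = 2, iter = 1000, seed = 1, refresh = 0)

# Each call returns one R2 value per posterior draw
r2_cond <- bayes_R2(fit, re.form = NULL)  # group-specific terms included
r2_marg <- bayes_R2(fit, re.form = NA)    # group-specific terms left out

# Summarise both posterior distributions side by side
round(rbind(
  conditional = quantile(r2_cond, c(0.05, 0.5, 0.95)),
  marginal    = quantile(r2_marg, c(0.05, 0.5, 0.95))
), 2)
```

Because both results are posterior distributions, they are better reported with intervals than as single numbers, and the gap between them is a description of the fit, not a test.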