I am looking to sensibly partition the variance from a mixed-effects logit model fit via Stan. I understand the typical Bayesian R^2 (Bayes-R2 and LOO-R2) is computed draw by draw as explained variance over explained-plus-residual variance:

R^2 = \frac{\mathrm{Var}(\mu)}{\mathrm{Var}(\mu) + \sigma^2_{\epsilon}}

where \mu is the linear predictor and \sigma^2_{\epsilon} the residual variance.
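For concreteness, here is a minimal sketch of the draw-wise Bayes-R2 computation on the probability scale (the function name and the fake posterior draws are illustrative, not from my actual model):

```python
import numpy as np

def bayes_r2_logit(p_draws):
    """Draw-wise Bayesian R^2 for a Bernoulli outcome.

    p_draws: array of shape (S, N), posterior predicted probabilities for
    S draws and N observations.  Explained variance is the variance of the
    fitted probabilities across observations; residual variance is the
    mean Bernoulli variance p * (1 - p), both on the probability scale.
    """
    var_fit = p_draws.var(axis=1)
    var_res = (p_draws * (1.0 - p_draws)).mean(axis=1)
    return var_fit / (var_fit + var_res)

# toy posterior: fake draws of a linear predictor pushed through the inverse logit
rng = np.random.default_rng(1)
p = 1.0 / (1.0 + np.exp(-rng.normal(0.0, 1.0, size=(4000, 200))))
r2 = bayes_r2_logit(p)  # one R^2 value per posterior draw
```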
I am, however, interested in obtaining estimates of the share of variance explained by each model component. Suppose a set of fixed effects with variance \sigma^2_f and a random effect with variance \sigma^2_a; then I'd like to calculate:

R^2_f = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_a + \sigma^2_{\epsilon}} \qquad R^2_a = \frac{\sigma^2_a}{\sigma^2_f + \sigma^2_a + \sigma^2_{\epsilon}}
Classic Bayesian approaches for logistic regression calculate \sigma^2_{\epsilon} at the predicted-probability level, as opposed to the latent logit level. I try to stick to this and calculate the component-specific variances with an approach that uses the predicted probabilities at each level, similarly to what is suggested in the old Gelman (2006) paper in the context of linear models, computing the variance contributed by each component for every posterior simulation.
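A hedged sketch of one such probability-scale calculation (my reconstruction of this style of computation; the exact per-level calculation may differ): each component's variance is taken as the across-observation variance of the predicted probabilities implied by that component alone.

```python
import numpy as np

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def prob_scale_shares(mu_fixed, u_random):
    """Per-draw probability-scale variance shares for two components.

    mu_fixed, u_random: (S, N) draws of each component's contribution to
    the linear predictor on the logit scale.  The residual variance is
    the mean Bernoulli variance p * (1 - p).  Because the inverse logit
    is nonlinear, these shares need not sum to 1.
    """
    p_full = inv_logit(mu_fixed + u_random)
    total = p_full.var(axis=1) + (p_full * (1.0 - p_full)).mean(axis=1)
    share_f = inv_logit(mu_fixed).var(axis=1) / total
    share_a = inv_logit(u_random).var(axis=1) / total
    return share_f, share_a
```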
But this has the very undesirable property of producing R^2 values that do not sum to 1 (and are at times larger than 1 on their own), regardless of which method is chosen for calculating the residual variance.
The Nakagawa et al. (2017) paper offers an interesting solution, using a corrected delta method to estimate the residual variance at the latent (logit) level. Based on other posts in this forum (e.g. here), people seem familiar with the paper. However, it is formulated from a frequentist perspective, and I'm having trouble applying it to a Bayesian model.
The formula they derive for the latent-scale residual variance (via the delta method) is:

\sigma^2_{\epsilon} = \frac{1}{\hat{p}(1-\hat{p})}
where \beta_0 is the model intercept, \sigma^2_\tau is the sum of all random-effect variance components in the model, and \hat{p} is the best guess for the expected value of our Bernoulli-distributed outcome variable y, computed from \beta_0 while accounting for \sigma^2_\tau.
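In code, the delta-method residual variance can be sketched as below; the logit-normal approximation I use for \hat{p} (the (16\sqrt{3}/(15\pi))^2 scaling of \sigma^2_\tau) is one common choice and an assumption on my part, not necessarily the exact adjustment in the paper.

```python
import numpy as np

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def latent_residual_var(beta0, sigma2_tau):
    """Delta-method residual variance on the latent (logit) scale.

    p_hat is approximated by shrinking the intercept toward zero to
    account for the random-effect variance sigma2_tau (logit-normal
    approximation; an assumption, other approximations of E[y] could
    be substituted).
    """
    c2 = (16.0 * np.sqrt(3.0) / (15.0 * np.pi)) ** 2
    p_hat = inv_logit(beta0 / np.sqrt(1.0 + c2 * sigma2_tau))
    return 1.0 / (p_hat * (1.0 - p_hat))
```

Note that the residual variance is minimized (at 4) when \hat{p} = 0.5 and grows as \hat{p} moves toward 0 or 1.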
If I calculate the R^2 replacing \sigma^2_\tau with the variance of my linear predictor \mathrm{Var}(\mu), the resulting R^2 is consistent with other R^2 measures, which would seem to suggest this is a sensible approach; it also allows me to partition the variance as I want to.
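Concretely, here is a sketch of the latent-scale partition I have in mind, using the theoretical logistic residual variance \pi^2/3 for simplicity (the delta-method value could be substituted) and assuming uncorrelated components so that \mathrm{Var}(\mu) splits cleanly:

```python
import numpy as np

def latent_shares(mu_fixed, u_random):
    """Per-draw latent-scale variance shares that sum to 1 by construction.

    mu_fixed, u_random: (S, N) draws of each component's contribution to
    the linear predictor on the logit scale.  Assumes the two components
    are uncorrelated across observations; otherwise a covariance term
    would also have to be allocated.
    """
    var_f = mu_fixed.var(axis=1)
    var_a = u_random.var(axis=1)
    var_e = np.pi ** 2 / 3.0  # theoretical logistic residual variance
    total = var_f + var_a + var_e
    return var_f / total, var_a / total, var_e / total
```

The three shares sum to 1 for every posterior draw, which is exactly the property the probability-scale version lacks.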
I have the following questions which I hope folks on the forum may help with:
- The Nakagawa formula is very insistent on the distinction between random and fixed effects: \sigma^2_\tau is meant to be the sum of the random-effect variance components only, not including the fixed-effects variance. From a Bayesian perspective, though, I do not think there is any actual difference (everything is a 'random', or better a 'modeled', effect), so I thought it would be sensible to also include the variance of the fixed-effect components in \sigma^2_\tau, hence my choice to replace it with \mathrm{Var}(\mu). Is this a sensible approach in your estimation?
- Relatedly, the paper suggests it is possible to replace \beta_0 with the full fixed-effect predictor, namely \beta_0 + \sum^K_{k=1}\beta_k x_{i,k}. Here I do not understand why I could not, or should not, also include the random effects in this predictor; again, I'm not sure how the conceptual difference between frequentist and Bayesian random effects interacts with the Nakagawa formula.
- Is it somehow possible to obtain the variance partitioning I want, namely a set of R^2 values for each model component of interest that also sum to 1, within the loo framework?
Thanks in advance for your help and time.