Can I obtain bayes_R2 for reach level using rstanarm?

Carolina_Schwedhelm · January 18, 2018, 9:17am

I have a three level hierarchical data structure with normal distributed outcome. I successfully fitted my model using stan_glmer, but need the R2 for the explained variance for each of the levels. In rstanarm, I was only able to get one R2 value for the whole model using bayes_R2.

Is there a way to obtain this for each level, in the way of Gelman’s 2004 “Bayesian measures of explained variance and pooling in multilevel (hierarchical) models” paper?

Alternatively, is there a way to obtain the level-specific R2 from a lme4 fitted model?

Thanks a lot in advance for your help!
Carolina

bgoodri · January 18, 2018, 3:44pm

Not from the bayes_R2 function. But the concept is just

ratio <- apply(posterior_linpred(post, transform = TRUE), 1, FUN = var) / 
               apply(posterior_predict(post), 1, FUN = var))

so you could do the variance calculation over whatever subgroups you want.

Carolina_Schwedhelm · January 19, 2018, 4:56pm

Thanks a lot!
I did this but am not getting the expected results, so I must be doing something wrong when trying it subgroup-specific.

For getting the subgroup-specific posterior_linpred and posterior_predict, I did the following (example for highest group level):

post_id <- posterior_predict(fit, re.form = ~ (1|id))
postlin_id <- posterior_linpred(fit, transform = TRUE, re.form = ~ (1|id))

Where fit is my stan_glmer object for the whole model.

Then I computed the R2:
rsquared.id <- 1 - mean (ratio)

The results that I got were way greater than expected.

Hope my mistake is easy to spot.
Thanks a lot for your help!

bgoodri · January 19, 2018, 6:43pm

And ratio is apply(postlin_id, 1, var) / apply(post_id, 1, var)?

Carolina_Schwedhelm · January 20, 2018, 12:17am

Yes, this is the ratio I specified

Carolina_Schwedhelm · January 23, 2018, 1:17pm

I found the code for the function bayes_R2.stanreg (https://github.com/jgabry/bayes_R2/blob/master/bayes_R2.tex) to be:

bayes_R2 <- function(fit) {
y <- get_y(fit)
yhat <- posterior_linpred(fit)
e <- -1 * sweep(yhat, 2, y)
var_yhat <- apply(yhat, 1, var)
var_e <- apply(e, 1, var)
var_yhat / (var_yhat + var_e)
}

and was also able to apply it to specific levels/subgroups (same result as bayes_R2(fit,re.form= ~ (1|identr)).

However, these results (for full model and also for separate levels) were different from the R2 using this ratio:

ratio <- apply(posterior_linpred(post, transform = TRUE), 1, FUN = var) /
apply(posterior_predict(post), 1, FUN = var))

According to Gelman 2004 (https://www.econstor.eu/bitstream/10419/142498/1/EERI_RP_2004_04.pdf), R2 is obtained by subtracting this ratio from 1. However, results are not consistent.

Am I measuring different things here? and how can I interpret them? Which method would be the correct one for calculating the expalined variance by levels in a hierarchical model?

Thanks for your help!

bgoodri · January 23, 2018, 4:23pm

The question of what is the R^2 within a subgroup is a bit different from the question how much variability does a term add to the model (which is what I think you are asking now). You can do as.data.frame(fit) to get the posterior distribution of the standard deviation in the intercepts across levels of identr and compare that to the standard deviation of the measurement errors for example. In more complicated models than this, it can get tricky, which is why I always suggest just doing things the Bayesian way

PPD_full <- posterior_predict(fit) # what the full model predicts
var_PPD_full <- apply(PPD_full, 1, var) # the variance over the observations
PPD_part <- posterior_predict(fit, re.form = NA) # what the model predicts without (1|identr)
var_PPD_part <- apply(PPD_part, 1, var) # the variance over the observations
ratio <- 1 - var_PPD_part / var_PPD_full # the contribution of (1|identr)
summary(ratio) # or hist(ratio)

Carolina_Schwedhelm · January 26, 2018, 11:44am

Thanks a lot for your answer. I am sorry for the confusion. My analyses consist of two parts. First, calculating how much variability a term adds to the model (the ICC in the empty model, just with my random intercepts) I have these results already. Then for the second part, I want to quantify the variance explained by my predictors in each level (I think this is the R2 within my subgroups).

My dataset is actually structured in 3 levels: participants (identr), meal type (fco_code), and data level.

Is the following the correct use of the R2 by subgroups measure?

#R2 level 3 (identr)
post_identr ← posterior_predict(fit,re.form = ~ (1|identr))
postlin_identr ← posterior_linpred(fit,transform=TRUE,re.form = ~ (1|identr))
r2_identr ← mean(apply(postlin_identr,1,FUN=var))/ mean(apply(post_identr,1,FUN=var))

#R2 level 2 (fco_code)
post_fco ← posterior_predict(fit,re.form= ~ (1|fco_code:identr))
postlin_fco ← posterior_linpred(fit,transform= TRUE, re.form = ~ (1|fco_code:identr))
r2_fco ← mean(apply(postlin_fco,1,FUN=var))/mean(apply(post_fco,1,FUN=var))

#R2 level 1 (data level)
postlin_data ← posterior_linpred(fit,transform=TRUE, re.form=NA)
post_data ← posterior_predict(fit, re.form=NA)
r2_data ← mean(apply(postlin_data,1,FUN=var))/ mean(apply(post_data,1,FUN=var))

For example in level 3, does the result give only the variance explained by the predictors in my fit model within this subgroup (or does this include also the observation level)?

And in the case of level 2, this R2 means the variance explained by the predictors only on the fco_code level? (note that these are random intercepts within level 3).

My doubts arise because I am getting unexpected results: among the predictors in the model are sex, which I would expect to explain an important part of the variance in the participant (identr) level, however, the value comes out to be around 0.08.

Thank you!!

Topic		Replies	Views
Marginal and Conditional R2 for Stan's GLMMs Modeling	6	1417	February 1, 2019
Extracting overall coefficient estimates from rstanarm rstanarm	2	1149	October 19, 2017
Question about hierarchical effects rstanarm	2	726	February 15, 2018
bayes_R2 and conditioning on random effects Modeling rstanarm	20	3021	April 7, 2021
Rstanarm: How to define informative priors from previous studies using stan_glm? Modeling rstan , priors , rstanarm	7	713	March 30, 2021

Can I obtain bayes_R2 for reach level using rstanarm?

Related Topics