I am using brms to fit a model with nested clustering (i.e., each target is rated by each rater):
brm(
  formula = rating ~ 1 + x1 + x2 + x3 + (1 | rater / target),
  prior = c(...),
  data = data
)
If possible, I would like to quantify how much of the variance in rating was explained by each of the predictors: x1, x2, and x3 as well as by the clustering variables rater and target. I get the sense from searching here that the key may be using posterior_linpred() or posterior_predict() but I’m hoping someone can confirm this. Also, would it change things if I had interaction effects and/or random slopes?
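You're on the right track with `posterior_linpred()`/`posterior_predict()`. One hedged sketch (assuming your fitted model is stored in `fit`; `re_formula` is passed through to the posterior-prediction machinery): `brms::bayes_R2()` lets you compare R² with and without the group-level terms, which gives one notion of how much the clustering variables add:

```r
library(brms)

# Per-draw R^2 conditioning on fixed effects only vs. on all effects
# (`fit` is the fitted brm() model; summary = FALSE returns draws, not a table)
r2_fixed <- bayes_R2(fit, re_formula = NA, summary = FALSE)
r2_full  <- bayes_R2(fit, re_formula = NULL, summary = FALSE)

# Posterior for the additional variance explained by the rater/target terms
quantile(r2_full - r2_fixed, c(0.025, 0.5, 0.975))
```

The same comparison logic carries over with interactions or random slopes, but note that when terms are correlated the split is not unique.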
An explanation would be ideal, but relevant references would also be appreciated. Thanks!
This problem has no unique answer if the terms are correlated with each other: in that case there is, in general, no unique way to partition the explained variance among the predictors.
Are you directly interested in the variances explained, or do you have some other goal that you are trying to reach by means of variances explained?
Rights, J.D., & Sterba, S.K. (in press). Quantifying explained variance in multilevel models: An integrative framework for defining R-squared measures. Psychological Methods.
But if that’s not possible, perhaps you could help me achieve a similar goal: (1) How much variance was explained by each clustering variable: rater and target? (2) How important was each predictor?
For (1), perhaps performance::icc() is relevant. I get this output but am not fully sure how to interpret it. This also doesn’t seem to separate out the two clustering variables (perhaps this isn’t possible).
# Random Effect Variances and ICC
Conditioned on: all random effects
## Variance Ratio (comparable to ICC)
Ratio: 0.77 CI 95%: [0.76 0.78]
## Variances of Posterior Predicted Distribution
Conditioned on fixed effects: 0.81 CI 95%: [0.77 0.84]
Conditioned on rand. effects: 3.47 CI 95%: [3.42 3.52]
## Difference in Variances
Difference: 2.66 CI 95%: [2.60 2.72]
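For what it's worth, the numbers in that output are internally consistent: the reported ratio appears to be the difference in variances divided by the variance conditioned on all effects (a quick arithmetic check, not the package's documented formula):

```r
# Quick check using the values from the icc() output above
v_fixed <- 0.81  # variance conditioned on fixed effects only
v_full  <- 3.47  # variance conditioned on all (incl. random) effects
(v_full - v_fixed) / v_full
#> [1] 0.7665706  (matches the reported ratio of 0.77)
```

Read this way, roughly 77% of the variance in the posterior predictions is attributable to the random effects, which speaks to (1), though it does not split rater from target.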
For (2), I suppose I can interpret the regression coefficients, but I'd like (if possible) to compare the "importance" of the predictors to that of the clustering variables. Explained variance seemed like a possible way to put them on a common scale, but maybe I'm not thinking this through enough.
Thanks for helping me work through this. This community is great.
I’ve been reading a lot more about frequentist and Bayesian approaches to Generalizability theory and found the following papers very informative:
Jiang, Z. (2018). Using the linear mixed-effect model framework to estimate generalizability variance components in R. Methodology, 14(3), 133–142. https://doi.org/10/gfxf43
LoPilato, A. C., Carter, N. T., & Wang, M. (2015). Updating generalizability theory in management research: Bayesian estimation of variance components. Journal of Management, 41(2), 692–717. https://doi.org/10/gcp2sf
Based on this, it looks like I should reformulate my model to focus on sources of variance:
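For concreteness, a sketch of the reformulation I have in mind (the facet names r, t, s and the specific prior values here are placeholders, not final choices):

```r
library(brms)

# Crossed random intercepts, one per source of variance in an r x t x s design;
# the residual captures the confounded three-way interaction plus error
fit_g <- brm(
  rating ~ 1 + (1 | r) + (1 | t) + (1 | s) +
    (1 | r:t) + (1 | r:s) + (1 | t:s),
  prior = c(
    prior(normal(0, 1), class = "Intercept"),  # placeholder values
    prior(cauchy(0, 0.5), class = "sd")        # half-Cauchy via the lower bound on sd
  ),
  data = data
)
```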
Then I should be able to derive the variance component for each source from the posterior draws and calculate the desired Generalizability coefficients. Will follow up once my model converges and I can play with this some more.
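For reference (my notation, following the standard G-theory formulation rather than a specific post here): in a simple one-facet $p \times r$ design with $n_r$ raters, the generalizability coefficient for relative decisions is

$$E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pr,e} / n_r}$$

and computing it per posterior draw yields a full posterior distribution for the coefficient.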
Ok, success! I estimated a multilevel model for a (r \times t \times s) G study, which means I needed random intercepts for r, t, s, r:t, r:s, and t:s. I ended up using informative cauchy() priors on the SD parameters and an informative normal() prior on the intercept.
Then, for each posterior draw, I pulled out the SD parameter for each random-intercept term and squared it to get the corresponding variance estimate. Summing these gives the total variance, and dividing each component by that total gives the proportion of variance attributable to each source. Finally, I summarized across draws with posterior medians and highest-density continuous intervals.
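In code, the per-draw computation looks roughly like this (a sketch: `fit_g` stands for the fitted G-study model, and the `sd_` prefix matches brms's naming for random-intercept SD parameters):

```r
library(posterior)

draws <- as_draws_df(fit_g)
# One sd_* column per random intercept, plus the residual SD (sigma)
sd_cols <- c(grep("^sd_", names(draws), value = TRUE), "sigma")
vars  <- as.data.frame(draws[, sd_cols])^2  # square SDs -> variance components
total <- rowSums(vars)
props <- vars / total                       # per-draw proportion of total variance
apply(props, 2, median)                     # posterior median for each source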
These estimates are quite similar to the REML ones I got from a frequentist G study, but have the added benefits of including interval estimates and avoiding negative variance estimates.
Also, based on ten Hove et al. (2020), I switched from half-Cauchy to half-Student-t priors on the SD parameters of the random intercepts, switched from the posterior median to the posterior mode for point estimates, and used random initial values.