Calculating variance explained by each predictor and clustering variable in brms

I am using brms to fit a model with nested clustering (i.e., each target is rated by each rater):

  brm(
    formula = rating ~ 1 + x1 + x2 + x3 + (1 | rater / target),
    prior = c(...),
    data = data
  )

If possible, I would like to quantify how much of the variance in rating is explained by each of the predictors (x1, x2, and x3) as well as by the clustering variables (rater and target). I get the sense from searching here that the key may be posterior_linpred() or posterior_predict(), but I’m hoping someone can confirm this. Also, would anything change if I had interaction effects and/or random slopes?

An explanation would be ideal, but relevant references would also be appreciated. Thanks!

I don’t want to leave you hanging on this question, but I don’t know the answer. You might find something interesting in here though: http://www.stat.columbia.edu/~gelman/research/unpublished/bayes_R2.pdf
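
For what it’s worth, brms implements that paper’s measure directly as bayes_R2(); a minimal call, assuming your fitted model is stored in fit:

library(brms)

# Bayesian R-squared (Gelman et al.); `fit` stands in for the brmsfit
# returned by the brm() call above.
bayes_R2(fit)                   # posterior summary: estimate + 95% interval
bayes_R2(fit, summary = FALSE)  # one R^2 value per posterior draw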

This problem has no unique answer if the terms are correlated with each other: in that case there is, in general, no unique way to assign the explained variance to individual predictors.

Are you directly interested in the variances explained, or do you have some other goal that you are trying to address by means of variance explained?

Thanks for the quick responses. I guess this isn’t as straightforward as I was hoping.

I am trying to implement a variance decomposition a la Generalizability Theory. Something akin to: https://my.vanderbilt.edu/jasonrights/software/r2mlm/

Rights, J. D., & Sterba, S. K. (in press). Quantifying explained variance in multilevel models: An integrative framework for defining R-squared measures. Psychological Methods.

But if that’s not possible, perhaps you could help me achieve a similar goal: (1) How much variance was explained by each clustering variable: rater and target? (2) How important was each predictor?

For (1), perhaps performance::icc() is relevant. I get the output below but am not fully sure how to interpret it. It also doesn’t seem to separate out the two clustering variables (perhaps that isn’t possible).

# Random Effect Variances and ICC

Conditioned on: all random effects

## Variance Ratio (comparable to ICC)
Ratio: 0.77  CI 95%: [0.76 0.78]

## Variances of Posterior Predicted Distribution
Conditioned on fixed effects: 0.81  CI 95%: [0.77 0.84]
Conditioned on rand. effects: 3.47  CI 95%: [3.42 3.52]

## Difference in Variances
Difference: 2.66  CI 95%: [2.60 2.72]
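
The call producing this output was along these lines (fit being the brmsfit from my first post):

library(performance)

# For brmsfit objects, icc() compares the variances of posterior predictive
# distributions conditioned on fixed effects only vs. all effects, rather
# than summing variance components directly.
icc(fit)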

For (2), I guess I can interpret the regression coefficients, but I’d like (if possible) to compare the “importance” of the predictors to that of the clustering variables. Explained variance seemed like a possible way to do this, but maybe I’m not thinking it through enough.
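
For example, one rough comparison (not a true variance decomposition) might be to put each fixed-effect term on the same SD scale as the group-level effects, i.e., |b_k| × SD(x_k). A sketch, assuming fit is the model from my first post and x1 through x3 are numeric columns of data:

# Posterior draws of the fixed-effect coefficients (one column per predictor)
b <- brms::fixef(fit, summary = FALSE)

# SD contributed by each term to the linear predictor, per draw
term_sd <- sapply(c("x1", "x2", "x3"),
                  function(v) abs(b[, v]) * sd(data[[v]]))
apply(term_sd, 2, median)  # per-predictor posterior median SD

# Group-level SDs for rater and target, for comparison
brms::VarCorr(fit)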

Thanks for helping me work through this. This community is great.

I’ve been reading a lot more about frequentist and Bayesian approaches to Generalizability theory and found the following papers very informative:

Jiang, Z. (2018). Using the Linear Mixed-Effect Model Framework to Estimate Generalizability Variance Components in R. Methodology, 14(3), 133–142. https://doi.org/10/gfxf43

LoPilato, A. C., Carter, N. T., & Wang, M. (2015). Updating Generalizability Theory in Management Research: Bayesian Estimation of Variance Components. Journal of Management, 41(2), 692–717. https://doi.org/10/gcp2sf

Based on this, it looks like I should reformulate my model to focus on sources of variance:

# One random intercept per source of variance: the three facets and their
# two-way interactions
brm(
  formula = rating ~ 1 + (1 | rater) + (1 | target) + (1 | type) + 
    (1 | rater:target) + (1 | rater:type) + (1 | target:type), 
  prior = c(...), 
  data = data
)

Then I should be able to derive the variance component for each source from the posterior draws and calculate the desired Generalizability coefficients. Will follow up once my model converges and I can play with this some more.
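
In the meantime, a quick way to eyeball the posterior SD of each source once sampling finishes (assuming the brm() call above is assigned to fit):

# Posterior summaries (estimate + interval) of the group-level SDs,
# one block per variance source
brms::VarCorr(fit)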

Was just about to say that for G theory you want to model each source of variance independently. Sounds like cool work; look forward to the update! :)

Ok, success! I estimated a multilevel model for an r × t × s G study, which meant I needed random intercepts for r, t, s, r:t, r:s, and t:s. I ended up using informative cauchy() priors on the SD parameters and an informative normal() prior on the intercept.

library(tidyverse)
library(brms)
library(tidybayes)

# One random intercept per facet and per two-way interaction
bgt <- 
  brm(
    formula = bf(rating ~ (1 | rater_id) + (1 | target_id) + (1 | stimulus_type) + 
      (1 | rater_id:target_id) + (1 | rater_id:stimulus_type) + 
      (1 | target_id:stimulus_type)),
    prior = c(...),
    data = dat, 
    ...
  )
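
The actual priors are elided above, but for concreteness, here is a hypothetical specification in the spirit described (all values invented for illustration):

# Hypothetical priors only; brms treats a cauchy() prior on an SD
# parameter as half-Cauchy because SDs are constrained to be positive.
priors <- c(
  prior(normal(4, 1), class = Intercept),  # invented rating-scale center
  prior(cauchy(0, 0.5), class = sd)        # half-Cauchy on all SD terms
)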

Then, for each posterior draw, I pulled out the SD parameter for each random intercept (plus the residual SD, sigma) and squared it to get the variance estimates. These can then be summed to get the total variance, and each component divided by that total to get its proportion of variance explained. Finally, I summarized across all draws by calculating posterior medians and 95% highest-density continuous intervals.

bgt %>% 
  tidy_draws() %>% 
  # keep the group-level SDs and the residual SD for each draw
  select(starts_with("sd_"), sigma) %>% 
  # square the SDs to get variances (columns get an "_sq" suffix)
  transmute_all(.funs = list(sq = ~(. ^ 2))) %>% 
  # total variance per draw, then each component as a proportion of it
  mutate(total_var = rowSums(.)) %>% 
  mutate_at(.vars = vars(-total_var), 
            .funs = list(pct = ~(. / total_var))) %>% 
  # summarize each column across draws with a median and 95% HDCI
  map_df(.f = ~ median_hdci(., .width = 0.95), .id = "component") %>% 
  # tidy up the component names
  mutate(
    component = str_remove(component, "sd_"),
    component = str_remove(component, "__Intercept_sq"),
    component = str_replace(component, "sigma_sq", "residual")
  )


These estimates are quite similar to the REML ones I got from a frequentist G study, but have the added benefits of including interval estimates and avoiding negative variance estimates.
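
To close the loop on the generalizability coefficients mentioned above: with per-draw variance components in hand, a relative G coefficient for targets falls out draw by draw. A sketch, where v_t, v_rt, v_ts, and v_res are placeholder vectors of per-draw variances (target, rater:target, target:type, residual) and n_r and n_s are the numbers of raters and stimulus types averaged over; none of these names come from the code above:

# Relative generalizability coefficient, per posterior draw, for a crossed
# rater x target x type design with target as the object of measurement
g_rel <- v_t / (v_t + v_rt / n_r + v_ts / n_s + v_res / (n_r * n_s))

# Posterior median and 95% HDCI (median_hdci() from tidybayes, loaded above)
median_hdci(g_rel, .width = 0.95)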

I ended up building out a package for this stuff: varde (https://github.com/jmgirard/varde), R functions for variance decomposition.

Also, based on ten Hove et al. (2020), I switched from half-Cauchy to half-Student-t priors on the random-intercept SD parameters, switched from the posterior median to the posterior mode for point estimates, and added random initial values.

ten Hove, D., Jorgensen, T. D., & van der Ark, L. A. (2020). Comparing Hyperprior Distributions to Estimate Variance Components for Interrater Reliability Coefficients. In M. Wiberg, D. Molenaar, J. González, U. Böckenholt, & J.-S. Kim (Eds.), Quantitative Psychology (Vol. 322, pp. 79–93). Springer International Publishing.
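
For anyone adapting the earlier pipeline: ggdist/tidybayes also ship mode_hdci(), so switching from posterior medians to density-based posterior modes is a one-line change to the summary step (whether varde computes modes the same way internally is my assumption, not something stated above):

library(tidybayes)  # re-exports mode_hdci() from ggdist

# Drop-in replacement for median_hdci() in the earlier pipeline; the toy
# vector is only there to make the call runnable on its own.
mode_hdci(rnorm(4000), .width = 0.95)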

Can this package also extract the variance partition for the fixed-effect predictors?