Questions on the individualized formulas in HLM

Dear community,

I guess my questions are pretty basic, but I didn’t manage to figure them out by myself. They all around individual formulas from a hierarchical model.

Say I have a model with three continuous variables and several observations for each subject.

library(tidyverse)
library(datawizard)
library(brms)

set.seed(1234)

Data <- data.frame(DV = rnorm(80),
                   A  = rnorm(80),
                   B  = rnorm(80),
                   C  = rnorm(80),
                   Subj = as.factor(rep(c(1:5),16))
                   )

Data2Model <- Data %>%
  group_by(Subj) %>%
  mutate(
    A = standardise(A), # Center variables per subject
    B = standardise(B),
    C = standardise(C)
  ) %>%
  ungroup()

Model <- 
  brm(
  data = Data2Model,
  formula = DV ~ A + B + C + (A+B+C|Subj),
  chains = 4,
  cores = 4
)

As far as I understand, I can get the individual formulas for each subject using this:

coef(Model) %>% as.data.frame() %>% select(contains("Estimate"))

#   Subj.Estimate.Intercept Subj.Estimate.A Subj.Estimate.B Subj.Estimate.C
# 1             0.022104393    0.0009816632     -0.13212726      0.47990893
# 2             0.012057911    0.0131331115     -0.06024109      0.07725161
# 3            -0.034164734    0.0717136684     -0.06014023      0.33624930
# 4             0.017360800    0.0356747686      0.05482032     -0.18452892
# 5            -0.004253234    0.0328254212     -0.13192382      0.01854751

The formula for Subject 1 can be described as:

\def\Plus{\texttt{+}} \def\Minus{\texttt{-}} Y_{Subject1} = 0.022\Plus0.0009X_A\Minus0.1321X_B\Plus0.4799X_C

and for Subject 2:

Y_{Subject2} = 0.012\Plus0.0131X_A\Minus0.0602X_B\Plus0.0772X_C

My questions are:

  1. Is there a measure for how these formulas are different from one another? Is that, in practice, the adjusted ICC?

  2. Is there a way to inspect whether the individual formulas given by the HLM predict better than if I would extract the formulas from 5 single regression models (one for each subject)? I would like to measure what is the contribution of the shrinkage and what are the benefits of the use of the HLM to compute the coefficients?

  3. It is more like a cross-validation question (actually, maybe they all are). I am interested in how much the individual formulas are individualised. How can I inspect whether the formula’s coefficients of subject 1 will better predict the data of subject 1 and not the coefficients formula of subject 2 for the data of subject 1?

Thanks a lot,

SH

  1. Assuming a 2-level model with a single grouping variable - yes, the various ICC measures can be seen as a measure of how much the group-level regression formulas are different, globally. Or you can look at the \sigma_{\text{slope}_{i}} as a local, per-slope variance measure.

  2. The HLM “formulas” are biased due to shrinkage (in the bias-variance sense), so their in-sample performance is generally expected to be worse than 5 individual 1-level fixed effect model. You can look at some out-of-sample performance measure - perhaps LOOIC can give this? I am not too familiar with how that algo works.

  3. This is the opposite of Q2, but should probably also use some CV based measures. You can compare the out-of-sample (holding out each of the 5 subjects this time and fitting on the 4 remaining) predictions of a fixed effect model based on the leave-on-subject-out model. Alternatively, you can compare predictions for each subject (A) at the population level vs (B) at the group level:

brms::posterior_epred(model, re_formula = NA) # population level

brms::posterior_epred(model, re_formula = NULL) # group level
1 Like