I want to estimate the proportion of variance in an outcome y
attributable to a level of grouping (coach
) in a mixed model. This is similar to analyses of psychotherapy where patients are nested in therapists and analysts estimate the therapist effect.
I have secondary data from a trial with complex randomization and partial/crossed nesting.
- Villages were randomly assigned to treatment or control. This randomization was blocked by village type (
host
). - Households in the treatment villages were randomly assigned to groups of ~25 people, and these groups were randomly assigned to 1 of 4 arms: arms 1-3 were variants of the core program, while arm 4 was a spillover control arm, meaning these were people who lived in treated villages but were not themselves in the program. Therefore,
trt==1
if arm is 1, 2, or 3;trt==0
for people in control villages or arm 4 (spillover control) - Coaches were randomly assigned to arms 1 to 3. Coaches crossed villages (meaning they could have groups in multiple villages).
- Everyone was assigned to a group, but groups were only meaningful for people in arms 1-3. There were “groups” of people assigned to arm 4 (spillover control), but there were no group activities because these folks did not receive any programming.
Since group and coach were not meaningful for control individuals (people from control villages and folks in arm 4), I assigned them singleton clusters for coach
and group
following Candlish et al. (2018).
This is the model I tried to fit using some simulated data:
library(tidyverse)
library(brms)
library(performance)
df <- read_csv("https://raw.githubusercontent.com/ericpgreen/g2r_coaching/main/reports/sim.csv")
fit <- brms::brm(y ~
trt + # 1 if arm 1, 2, or 3
# 0 if pure control or arm 4
# (arm 4 is control in a trt village)
coaching_individual + # 1 if arm 1, individual trt
coaching_group + # 1 if arm 2, group trt
no_asset + # 1 if arm 3, ind trt, no $$
spillover_control + # 1 if arm 4, no program
host + # 1 if host community
(1 | village) + # people nested in villages
(0 + trt | coach) + # people nested in coaches
# (but only when trt==1)
(0 + trt | group), # people nested in groups
# (but groups only meaningful trt==1)
data = df)
I thought I could use performance::icc()
to estimate ICC for coach
, but it complained about the random slopes and only produces an estimate for village
.
performance::icc(fit, by_group = TRUE)
I tried to use performance::variance_decomposition()
, but it only returns an overall result.
I think the answer is that the ICC is not possible to calculate when there are random slopes. Is this true, or is there a different approach I might take?
If it’s not possible, I’m wondering about the logic of subsetting the data to just trt==1
(thus removing the partial nesting and random slopes) and focusing on the coach effect among program recipients who received different variants of the program.
df2 <- df %>%
filter(trt==1)
fit2 <- brms::brm(y ~
coaching_group + # 1 if arm 2, group trt
no_asset + # 1 if arm 3, ind trt, no $$
host + # 1 if host community
(1 | village) + # people nested in villages
(1 | coach) + # people nested in coaches
# (but only when trt==1)
(1 | group), # people nested in groups
# (but groups only meaningful trt==1)
data = df2)