Part 2: Seeking examples of using brms to estimate treatment effects in experiment with multiple respondents per cluster

I have a design where families, consisting of 2-6 members, are assigned to an intervention or control arm and then all members report about the family using a 30-item questionnaire.

In Part 1 I set out to simulate a dataset that met the following criteria:

  • 100 families with between 2 and 6 members each
  • Families assigned 1:1 to treatment or control
  • Responses to 30 ordinal items, 1-10
    a) Person-level: items correlated to measure one latent construct
    b) Family-level: scale scores correlated within families
    c) Group-level: scale scores higher in treatment arm

@erognli walked me through an approach that I modified for this purpose.

With a function to simulate the data in hand, I’m returning to the modeling question. I’m seeking approaches that would let me incorporate data from multiple family members in the estimation of the effect of the intervention (trtGrp) on the family functioning construct measured by the 30 ordinal items.

Constructing a scale score for each family member and then fitting a mixed model with individuals nested in family does not seem to get at the idea that the outcome is at the family level, but reported by individual family members:

brms::brm(scale_avg ~ trtGrp + (1 | family), data = df)

@erognli pointed me to this special issue, “A Dozen Years of Demonstrating That Informant Discrepancies are More Than Measurement Error: Toward Guidelines for Integrating Data from Multi-Informant Assessments of Youth Mental Health”.

I decided to open a second thread to work through some ideas with the data df generated by our new sim_data() function.

# new to Part 2: added step to make items ordinal after creating scale score

sim_data <- function(n_families = 100, 
                     range_family_size = c(2,6), 
                     dist_mu_family = c(0,1), 
                     dist_sigma_family = c(1,.2),
                     seed = NA) {
  
  require(simstudy); require(magrittr); require(tidyverse)
  
  ddef <- defData(id = 'family',
                  varname = 'n_members',
                  dist = 'uniformInt',
                  formula = paste(range_family_size[1], 
                                  ';', 
                                  range_family_size[2])) %>%
    
    defData(varname = 'mu_family',
            dist = 'normal',
            formula = dist_mu_family[1],
            variance = dist_mu_family[2]) %>%
    
    defData(varname = 'sigma_family',
            dist = 'gamma',
            formula = dist_sigma_family[1],
            variance = dist_sigma_family[2],
            link = 'identity')
  
  d <- genData(n = n_families,
               dtDefs = ddef)
  
  # ADDED
  d <- trtAssign(d, n = 2)
  
  d <- genCluster(dtClust = d,
                  cLevelVar = 'family',
                  numIndsVar = 'n_members',
                  level1ID = 'id')
  
  ddef <- defDataAdd(varname = 'latent_score',
                     dist = 'normal',
                     formula = 'mu_family + trtGrp*0.5', # ADDED
                     variance = 'sigma_family')
  
  d <- addColumns(dtDefs = ddef,
                  dtOld = d)
  
  # https://cran.r-project.org/web/packages/simstudy/vignettes/ordinal.html
  d <- genOrdCat(dtName = d,
                 adjVar = 'latent_score',
                 baseprobs = 
                   matrix(c(rep(c(0.05, 0.31, 0.23, 0.03, 0.03, 0.21, 0.03, 0.03, 0.03, 0.05), 10),
                            rep(c(0.38, 0.22, 0.14, 0.11, 0.02, 0.02, 0.02, 0.02, 0.05, 0.02), 10),
                            rep(c(0.08, 0.41, 0.16, 0.05, 0.05, 0.08, 0.08, 0.02, 0.02, 0.05), 10)
                   ),
                   nrow = 30,
                   byrow = T),
                 asFactor = F,
                 idname = 'id',
                 prefix = 'item_',
                 rho = 0.425,
                 corstr = 'ind')
  
  d <- d %>%
    select(family, n_members, trtGrp, id, starts_with("item")) %>%
    mutate(scale_avg = rowMeans(across(starts_with("item")))) %>%
    as_tibble() %>%
    mutate(across(starts_with("item"),
                  ~factor(.x,
                          levels = c(1:10),
                          ordered = TRUE))
    )
  
  return(d)
}
df <- sim_data(seed = 8675309)