Seeking examples of using brms to estimate treatment effects in experiment with multiple respondents per cluster

I want to simulate data and estimate the effect of a family-based intervention where multiple members of the family complete a 30-item instrument that can yield an overall score measuring family functioning. Families would be randomly assigned to treatment or wait-list control.

My preliminary search pointed me to work by Bauer et al. on trifactor models and IRT. I’m seeking examples of this or competing approaches that would let me incorporate data from multiple family members in the estimation of the effect of the intervention on the family functioning score. Any tips?

Not quite what you’re looking for, but Wagenmakers et al. (2016; https://doi.org/10.1177/1745691616674458) did a multisite experiment with a Likert-type DV. I have a content sketch here where I fit a few models to the data of one of the sites. If you used that as a starting point, you could fit a fuller model on the data from all the sites, which would be somewhat analogous to persons nested in families (participants nested in labs).


Interesting problem! I’ve done work using Stan on multi-informant data with IRT models, but not with brms.

If you haven’t already, I’d recommend looking into the literature on informant discrepancies. Multi-informant assessment of family variables is not theoretically trivial: assuming that all differences among informants can be interpreted as measurement error requires strong assumptions and justification.

There was a special issue on the topic in the Journal of Clinical Child & Adolescent Psychology last year, which might be relevant: https://www.tandfonline.com/doi/full/10.1080/15374416.2022.2158843


Thanks @Solomon and @erognli. I appreciate the pointers. Exploring them both.

I wonder if either of you might have better ideas about how to simulate this data. I’ve been poking around with simstudy but I’m finding it to be a complex challenge.

Trying to simulate:

  • 100 families with between 2 and 6 members each
  • Families assigned 1:1 to treatment or control
  • Responses to 30 Likert-style items, 0-3
    a) Person-level: items correlated to measure one latent construct
    b) Family-level: scale scores correlated within families
    c) Group-level: scale scores higher in treatment arm

I’d like to simulate datasets that vary on the number of families and the details for a, b, and c.

Here’s my attempt at the first part to create the family structure:

library(simstudy)
library(data.table)
set.seed(8675309)

# family level
  gen.family <- defData(varname = "icc", 
                        dist = "uniform", formula = '0;.5', 
                        id = "idFamily")
  gen.family <- defData(gen.family, varname = "nMembers", 
                        dist = "uniformInt", formula = '2;6')
  
  dtFamily <- genData(100, gen.family)
  dtFamily <- trtAssign(dtFamily, n = 2)

  dtFamily

Yields:

     idFamily          icc nMembers trtGrp
  1:        1 0.0797418174        4      0
  2:        2 0.2390941635        2      1
  3:        3 0.3823993483        4      1
  4:        4 0.3848438379        3      0
  5:        5 0.1342742486        3      1
 ---
 95:       95 0.0120131961        4      0
 96:       96 0.1154628171        2      1
 97:       97 0.2969644184        3      0
 98:       98 0.1177376050        4      1
 99:       99 0.2571975326        6      0
100:      100 0.3215165243        6      1
     idFamily          icc nMembers trtGrp

And the individual members:

# individuals
  dtMembers <- genData(sum(dtFamily$nMembers))
  
  idFamily <- data.table(idFamily = rep(dtFamily$idFamily, dtFamily$nMembers))
  trtGrp <- data.table(trtGrp = rep(dtFamily$trtGrp, dtFamily$nMembers))
  
  dtMembers <- cbind(dtMembers, idFamily)
  dtMembers <- cbind(dtMembers, trtGrp)
  
  dtMembers

Yields:

      id idFamily trtGrp
  1:   1        1      0
  2:   2        1      0
  3:   3        1      0
  4:   4        1      0
  5:   5        2      1
 ---                    
394: 394      100      1
395: 395      100      1
396: 396      100      1
397: 397      100      1
398: 398      100      1

I haven’t used {simstudy} before, and I haven’t simulated ordinal data of that complexity before, so others will have to chime in. My best recommendation is to start small and build (e.g., first with one item and no nesting, then with 30 items and no other nesting, and so on).
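For that very first step, a single ordinal item with no nesting takes only a few lines of base R. This is just a sketch: the baseline probabilities (.60/.25/.10/.05) mirror the ones in the simstudy ordinal vignette, and `shift` is an illustrative stand-in for a treatment effect.

```r
set.seed(8675309)

n <- 1000
baseprobs <- c(0.60, 0.25, 0.10, 0.05)

# thresholds on the log-odds scale implied by the cumulative probabilities
cuts <- qlogis(cumsum(baseprobs)[1:3])

# cumulative-logit draw: a logistic latent variable compared to the thresholds;
# shift moves the latent location, so a positive shift yields higher scores
shift <- 0
item <- findInterval(rlogis(n, location = shift), cuts)  # scores 0-3

table(item) / n  # roughly .60/.25/.10/.05 when shift = 0
```

Once this looks right, the same threshold logic extends to 30 items and to nonzero shifts by arm.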


OK, here’s where I got.

  • Generate nested members (2-6) in 100 families, assign families to treatment or control
  • Generate 30 correlated ordinal items with a shift (effect) to make the treatment group’s sum scores lower on average
  • Simulate this 100 times

library(tidyverse)
library(simstudy)
library(data.table)
library(brms)
set.seed(8675309)

#' Simulate data
#' @param seed  simulation seed
#' @param N_fam total families
#' @param shift logistic shift applied to the treatment group's cumulative log-odds (passed to genOrdCat via adjVar)
#' @param items number of items in scale
#' @param rho correlation coefficient, -1 < rho < 1.

  sim_d <- function(seed, N_fam, shift, items, rho) {
    
    set.seed(seed)
    
    # family level
      def_family <- simstudy::defData(varname = "nMembers", 
                                      dist = "uniformInt", formula = '2;6',
                                      id = "idFamily")
      
      dtFamily <- simstudy::genData(N_fam, def_family)
      dtFamily <- simstudy::trtAssign(dtFamily, n = 2)
      
      dtFamily <- dtFamily %>% 
        dplyr::mutate(z = if_else(trtGrp==1, shift, 0))
    
    # individuals
      dtMembers <- simstudy::genData(sum(dtFamily$nMembers))
      
      idFamily <- data.table(idFamily = rep(dtFamily$idFamily, dtFamily$nMembers))
      trtGrp <- data.table(trtGrp = rep(dtFamily$trtGrp, dtFamily$nMembers))
      z <- data.table(z = rep(dtFamily$z, dtFamily$nMembers))
      
      dtMembers <- cbind(dtMembers, idFamily)
      dtMembers <- cbind(dtMembers, trtGrp)
      dtMembers <- cbind(dtMembers, z)
      
      
    # items
    # https://kgoldfeld.github.io/simstudy/articles/ordinal.html
      baseprobs <- matrix(rep(c(0.6, 0.25, 0.1, 0.05), items), 
                          nrow = items, byrow = TRUE)
      
      # z defined in family step above
      dtMembers <- simstudy::genOrdCat(dtMembers, 
                                       adjVar = "z", 
                                       baseprobs = baseprobs, 
                                       prefix = "q", 
                                       rho = rho, 
                                       corstr = "cs", 
                                       asFactor = FALSE) %>%
      # make possible item scores 0-3 (from 1-4)
        dplyr::mutate(dplyr::across(starts_with("q"), ~ .x - 1)) %>%
        dplyr::rowwise() %>% 
        dplyr::mutate(sum_score = sum(dplyr::c_across(starts_with("q")), na.rm = TRUE)) %>%
        dplyr::ungroup()
  }

# simulate and fit
  n_sim <- 100
  s <- tibble(seed = 1:n_sim) %>% 
    mutate(d = map(seed, sim_d, 
                   N_fam = 100,
                   shift = -0.3,
                   items = 30,
                   rho = 0.15)) %>% 
    # note: this refits (and recompiles) the model for every dataset;
    # fitting once and looping with update() would be much faster
    mutate(fit = map2(d, seed, ~ brm(
      bf(sum_score ~ 1 + trtGrp + (1 | idFamily)),
      backend = "cmdstanr",
      data = .x,
      seed = .y,
      cores = parallel::detectCores()
    )))
          
  s %>%
    mutate(sd  = map_dbl(d, ~ sd(.x$sum_score)),
           est = map_dbl(fit, ~ fixef(.x)["trtGrp", 1]),
           cd  = est / sd) %>%
    ggplot(aes(x = cd)) +
      geom_histogram()

The data generating mechanism does not make items (and thus scores) correlated at the family level. The items themselves are correlated for each individual, but the family nesting is not represented.

Anyone have other ideas about how to simulate?

Of course the original question is still lurking: how to model the data when they come from multiple reporters. In the simulation I just represent the nesting by family and use the sum scores as the DV.

I think I’d take a step back and think carefully about what you have measured and what effect you are expecting to find. You could simply generate parameters for a normal distribution of latent variables per family (with a shift of means for intervention or separate distributions for intervention and control groups), draw the latent variables for each individual from that, and then generate the item level data given the latent variable of each member and an ordinal model for each item, but whether that is a good idea depends on your assumptions about what you are measuring.

“Family functioning” is a very broad construct, and I don’t know what measure you have used, but looking hard at the items and thinking about the response process might be useful. Is this a reflective or a formative measure? And what are the indicators used in the items? If the different members of the family are responding to generalised questions about the ways the family interact (“We usually apologize after quarrels”), differences among them might reflect different recall of examples, different experiences in the subsystems/dyads composing the family, or different perceptions of the same situations. If they are responding to questions about more verifiable or objective markers of family functioning (“We usually eat dinner together four days of the week or more”), differences might to a greater degree represent measurement error from a common truth.

These issues may have implications for the simulations, and will definitely impact the interpretation of any analysis, so I think it’s important to think carefully about them (and maybe you have!).


Thanks for the thoughtful and helpful reply, @erognli.

I had not come across the terms “reflective” and “formative” before. If I’ve got it right, it’s what I understand to be the distinction between a scale and an index. For instance, items on a depression scale might be thought to be caused by the latent construct of depression (and thus the intercorrelations matter), whereas stocks in the S&P 500 drive the value of the index (and we don’t care about intercorrelations between stocks). Do I have this right?

The items are like your first example. Similar to the work on child behavior assessment that incorporates ratings from the parent, teacher, and child. Previously we’ve set aside the challenge and obtained one parent’s perspective, but this simulation is an attempt to move toward a more comprehensive approach to modeling.

You could simply generate parameters for a normal distribution of latent variables per family (with a shift of means for intervention or separate distributions for intervention and control groups), draw the latent variables for each individual from that, and then generate the item level data given the latent variable of each member and an ordinal model for each item

You had me at “simply” 😆. Could you share any guidance for doing this? I think it would help me across a range of situations, including this one.
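The recipe quoted above can be sketched in base R. Everything below is illustrative, not a definitive implementation: the variance components (`sd_family`, `sd_member`), the latent treatment effect, the item thresholds, and the simple alternating treatment assignment are all assumed values you would want to vary.

```r
# 1) draw a family-level latent mean (shifted for treatment),
# 2) draw each member's latent score around the family mean,
# 3) generate ordinal items from each member's latent score
#    with a cumulative-logit model
set.seed(8675309)

n_fam     <- 100
items     <- 30
effect    <- -0.3   # treatment shift on the latent scale (assumed)
sd_family <- 0.5    # between-family SD of the latent mean (assumed)
sd_member <- 1.0    # within-family SD (assumed)

# item thresholds implying baseline probs .60/.25/.10/.05 per category
cuts <- qlogis(cumsum(c(0.60, 0.25, 0.10, 0.05))[1:3])

families <- data.frame(
  idFamily = 1:n_fam,
  nMembers = sample(2:6, n_fam, replace = TRUE),
  trtGrp   = rep(0:1, length.out = n_fam)  # simple alternating assignment
)
families$theta_fam <- rnorm(n_fam, mean = effect * families$trtGrp,
                            sd = sd_family)

# expand to members, then draw each member's latent score
d <- families[rep(seq_len(n_fam), families$nMembers), ]
d$idMember <- seq_len(nrow(d))
d$theta    <- rnorm(nrow(d), mean = d$theta_fam, sd = sd_member)

# ordinal items: a logistic latent response compared to the thresholds;
# higher theta -> higher item scores (0-3)
for (j in seq_len(items)) {
  d[[paste0("q", j)]] <- findInterval(rlogis(nrow(d), location = d$theta), cuts)
}
d$sum_score <- rowSums(d[paste0("q", 1:items)])

# quick check of the arm means
tapply(d$sum_score, d$trtGrp, mean)
```

Because every member shares `theta_fam`, sum scores are now correlated within families (item b from the wish list), and `effect` and `sd_family` map directly onto items c and b. The resulting `d` can go straight into the multilevel models earlier in the thread.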