OK, here’s where I got.

- Generate nested members (2-6) in 100 families, assign families to treatment or control
- Generate 30 correlated ordinal items with a shift (effect) to make the treatment group’s sum scores lower on average
- Simulate this 100 times

```
library(tidyverse)
library(simstudy)
library(data.table)
library(brms)
set.seed(8675309)
#' Simulate data
#' @param seed simulation seed
#' @param N_fam total families
#' @param shift adjustment variable name in dtName - determines logistic shift
#' @param items number of items in scale
#' @param rho correlation coefficient, -1 < rho < 1.
sim_d <- function(seed, N_fam, shift, items, rho) {
set.seed(seed)
# family level
def_family <- simstudy::defData(varname = "nMembers",
dist = "uniformInt", formula = '2;6',
id = "idFamily")
dtFamily <- simstudy::genData(N_fam, def_family)
dtFamily <- simstudy::trtAssign(dtFamily, n = 2)
dtFamily <- dtFamily %>%
dplyr::mutate(z = if_else(trtGrp==1, shift, 0))
# individuals
dtMembers <- simstudy::genData(sum(dtFamily$nMembers))
idFamily <- data.table(idFamily = rep(dtFamily$idFamily, dtFamily$nMembers))
trtGrp <- data.table(trtGrp = rep(dtFamily$trtGrp, dtFamily$nMembers))
z <- data.table(z = rep(dtFamily$z, dtFamily$nMembers))
dtMembers <- cbind(dtMembers, idFamily)
dtMembers <- cbind(dtMembers, trtGrp)
dtMembers <- cbind(dtMembers, z)
dtMembers
# items
# https://kgoldfeld.github.io/simstudy/articles/ordinal.html
baseprobs <- matrix(rep(c(0.6, 0.25, 0.1, 0.05), items),
nrow = items, byrow = TRUE)
# z defined in family step above
dtMembers <- simstudy::genOrdCat(dtMembers,
adjVar = "z",
baseprobs = baseprobs,
prefix = "q",
rho = rho,
corstr = "cs",
asFactor = FALSE) %>%
# make possible item scores 0-3 (from 1-4)
dplyr::mutate_at(vars(starts_with("q")), ~.x-1) %>%
dplyr::rowwise() %>%
dplyr::mutate(sum_score = sum(c_across(starts_with("q")), na.rm = T)) %>%
ungroup()
}
# simulate and fit
n_sim <- 100
s <- tibble(seed = 1:n_sim) %>%
mutate(d = map(seed, sim_d,
N_fam = 100,
shift = -0.3,
items = 30,
rho = 0.15)) %>%
mutate(fit = map2(d, seed, ~ brm(
bf(sum_score ~ 1 + trtGrp + (1 | idFamily)),
backend = "cmdstanr",
data = .x,
cores = parallel::detectCores()
))
)
s %>%
mutate(sd = map(.x = d, ~ sd(.x$sum_score))) %>%
unnest(sd) %>%
mutate(est = map(.x = fit, ~fixef(.x)["trtGrp", 1])) %>%
unnest(est) %>%
mutate(cd = est/sd) %>%
ggplot(aes(x=cd)) +
geom_histogram()
```

The data generating mechanism does not make items (and thus scores) correlated at the family level. The items themselves are correlated for each individual, but the family nesting is not represented.

Anyone have other ideas about how to simulate?

Of course the original question is still lurking. How to model when data come from multiple reporters. In the simulation I just represent the nesting by family and use the sum scores as the dv.