(Moving this over from a brms issue at @paul.buerkner's recommendation.)
I am trying to fit this multilevel model using brms 2.22.0 and backend cmdstanr 2.36.0:
y ~ x_1 + ... + x_K + (x_1 + ... + x_K || g1 * g2)
However, when K = 32 the model has still not finished compiling after 15 minutes. Note that the following cases do compile:
- replacing || with |. However, with K = 32 I do not want to fit a 32 x 32 covariance matrix
- K = 2
- replacing g1 * g2 with just g1
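To keep these comparisons controlled, the variants can be generated from one helper so that each differs from the slow model in exactly one piece. This is only a sketch in base R: `make_formula_str` is a hypothetical helper name, not part of brms, and the comments simply restate the compile behaviour reported above.

```r
# Hypothetical helper (not part of brms): build the model formula as a string.
# K      = number of predictors x_1, ..., x_K
# bar    = "||" (uncorrelated random effects) or "|" (correlated)
# groups = grouping expression on the right of the bar
make_formula_str <- function(K, bar = "||", groups = "g1*g2") {
  x_sum <- paste(paste0("x_", seq_len(K)), collapse = "+")
  sprintf("y ~ %s + (%s%s%s)", x_sum, x_sum, bar, groups)
}

variants <- c(
  slow       = make_formula_str(32),                 # not compiled after 15 minutes
  correlated = make_formula_str(32, bar = "|"),      # compiles
  small_k    = make_formula_str(2),                  # compiles
  one_group  = make_formula_str(32, groups = "g1")   # compiles
)

print(variants[["small_k"]])
```

Each string can then be converted with `as.formula()` and passed to `brms::brm(..., backend = "cmdstanr", chains = 0)` inside `system.time()` to compare compile times.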
Looking at the generated Stan code, when switching from | to ||, things that were vectorized get unvectorized. For example, the priors on the random effects involve 32+1 calls to std_normal_lpdf.
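One way to put a number on this is to count how often std_normal_lpdf appears in the generated code for each variant. The following is a minimal base R sketch: `count_calls` is a hypothetical helper, and the two Stan snippets are made up for illustration and are not actual brms output.

```r
# Hypothetical helper: count literal occurrences of a function name in Stan code.
count_calls <- function(stan_code, fn_name) {
  code <- paste(as.character(stan_code), collapse = "\n")
  hits <- gregexpr(fn_name, code, fixed = TRUE)[[1]]
  # gregexpr() returns -1 when there is no match
  if (hits[1] == -1) 0L else length(hits)
}

# Made-up snippets for illustration only (not actual brms output).
vectorized <- "target += std_normal_lpdf(to_vector(z_1));"
unvectorized <- paste(
  "target += std_normal_lpdf(z_1[1]);",
  "target += std_normal_lpdf(z_1[2]);",
  sep = "\n"
)

n_vec <- count_calls(vectorized, "std_normal_lpdf")
n_unvec <- count_calls(unvectorized, "std_normal_lpdf")

# With brms installed, the same helper could be applied to real output, e.g.
# count_calls(brms::make_stancode(formula, df), "std_normal_lpdf")
```

Running the helper on the | and || versions of the same formula would show how many separate prior statements each one generates.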
It’s not clear to me why this takes forever. It is complicated, but not that complicated? Even the vectorized | model is slow to compile, as @paul.buerkner points out. And why is the ||g1 model fine?
Here’s a script to generate fake data, print the model, and compile:
N = 100 # number of observations
K = 32 # number of predictors
J1 = 2 # number of groups g1
J2 = 2 # number of groups g2
df = as.data.frame(matrix(rnorm(N*K), nrow=N))
x_names = paste0("x_", 1:K)
colnames(df) = x_names
df$g1 = factor(sample(1:J1, N, replace=TRUE))
df$g2 = factor(sample(1:J2, N, replace=TRUE))
# y has no relationship to x and g, just fake data for compiling model
df$y = rnorm(N)
sum_x_str = paste(x_names, collapse="+")
formula = as.formula(sprintf("y ~ %s + (%s||g1*g2)", sum_x_str, sum_x_str))
brms::make_stancode(formula, df, backend="cmdstanr", chains=0)
brms::brm(formula, df, backend="cmdstanr", chains=0)