Categorical multi-level model performance

  • Operating System: Windows 10
  • brms Version: 2.9.0

Hi,

I’m trying to run a multi-level analysis with a 3-category outcome variable. However, even the null model runs slowly: about 30 minutes for one chain on my machine, and much longer once predictors etc. are introduced.

I’ve attached the relevant data, and the analysis can be run with:

library(brms)

data <- read.csv("testdata.csv")

# Null model: intercepts only, with a varying intercept per group
m <- brm(dv ~ 1 + (1 | grp),
         data = data,
         family = categorical,
         chains = 1
)

This uses the default priors. I have also tried some different options (including gamma, normal, and different parameters to student_t), none of which improved performance. The results look reasonable regardless of priors, and the Rhat values and effective sample sizes look OK. Plot of results with 4 chains:

I am new to Bayesian analysis, so it’s highly likely I’m doing something wrong. In particular, I’m not confident I’ve used reasonable priors. I’d appreciate any help you can give me with this.

Thanks.

testdata.csv (158.0 KB)

Categorical models are hard to fit, and sampling speed may be improved by weakly informative priors on the regression coefficients. For instance, prior(normal(0, 5), class = "b") could help.
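As a rough sketch of how such a prior could be passed to brm(): the null model in the question has no population-level coefficients, so a class "b" prior only applies once a predictor is added. Here x is a hypothetical predictor column (it is an assumption, not something known to be in the attached data), and get_prior() is called first to see which parameters actually accept priors.

```r
library(brms)

data <- read.csv("testdata.csv")

# ASSUMPTION: `x` is a hypothetical predictor column used only for illustration.
f <- dv ~ 1 + x + (1 | grp)

# List every parameter that can take a prior, along with brms' defaults.
# For categorical models the `dpar` column shows one entry per
# non-reference outcome category.
get_prior(f, data = data, family = categorical())

# Weakly informative prior on the regression coefficients, as suggested above.
m <- brm(f,
         data = data,
         family = categorical(),
         prior = prior(normal(0, 5), class = "b"),
         chains = 1)
```

If brms complains that the prior does not match any model parameter, the dpar column of the get_prior() output gives the names to pass through the dpar argument of prior(), one prior per category-specific predictor.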

That said, you seem to have over 20k observations, in which case longer fitting times for Bayesian models are very plausible and may be something you have to get used to.

Thanks. I can deal with it taking a while to run. I was just unsure whether something could be done about it, or whether it might indicate something going wrong.

Thanks a lot for your help.