Shape parameter convergence issues with multilevel models

When attempting to vary the shape parameter in multilevel models, I get extreme Rhats and very low effective sample sizes. This then translates into extreme predictions that don’t appear to relate well to the data the model was trained on.

To diagnose the root of the problem, I’ve created simplified versions of my model and still run into the same issues. There could easily be something I’m doing wrong with the way I’m constructing the model.

Here is a reproducible example, which I’ve tried to make as simple as possible, and one where I wouldn’t expect convergence issues:

library(brms)

set.seed(1)  # for reproducibility
n <- 100
# two groups simulated with very different dispersions (size = 10 vs size = 1)
Group1 <- data.frame(Group = "Group1", y = round(rnbinom(n, size = 10, mu = 20)))
Group2 <- data.frame(Group = "Group2", y = round(rnbinom(n, size = 1, mu = 15)))
df <- rbind(Group1, Group2)
df$x1 <- df$y + rnorm(nrow(df), 5, 3)  # nrow(df) avoids silently recycling the noise over the 2n rows

mod <- brm(bf(y ~ x1 + (1|Group),
              shape ~ (1|Group)),
           family = negbinomial(), data = df)

summary(mod) then typically displays effective sample sizes of approximately 2 and Rhats > 1000, across all parameters.

If I do not let the shape parameter vary:

mod <- brm(y ~ x1 + (1|Group),
           family = negbinomial(), data = df)

I do not get the same convergence problems.

What could I be doing wrong? I’ve tried referring to this vignette, but it didn’t solve my problem.

  • Operating System: Windows 10 Home
  • brms Version: 2.2.0

Since you have only 2 groups, a group-level formulation will likely cause some trouble without carefully set priors. I would probably just go for

bf(y ~ x1 + Group, shape ~ Group)
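
A full call along those lines, reusing the same df and negbinomial() family from your example, might look like this sketch:

mod <- brm(bf(y ~ x1 + Group,      # Group as a population-level effect on the mean
              shape ~ Group),      # and on the shape parameter
           family = negbinomial(), data = df)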

Does that already solve your problem?

Unfortunately not; I also tried 3 groups and got the same problem.

The actual dataset in which I originally encountered this problem also has 100+ groups, each with 30+ samples.

I haven’t, however, looked into setting priors (beyond the default uninformative ones), but instinctively I don’t have a lot of prior information that I could add to the model.

It just surprises me that the model doesn’t even come close to converging as it is.

I think you probably know a bit more than you think. Are parameter estimates on the order of a trillion just as plausible a priori as smaller estimates? Even encoding “weak” information like that in your prior can help. Essentially you’re softly limiting the search space to regions that aren’t a priori absurd, which can have noticeable computational benefits.
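
In brms, that kind of weak information can be passed in via set_prior(); the distributions and scales in the sketch below are only illustrative placeholders I’m assuming for the example above, not recommendations:

# illustrative weakly informative priors -- placeholder scales, not recommendations
priors <- c(
  set_prior("normal(0, 5)", class = "b"),                        # population-level effects
  set_prior("student_t(3, 0, 2)", class = "sd"),                 # group-level SDs for the mean
  set_prior("student_t(3, 0, 2)", class = "sd", dpar = "shape")  # group-level SD for shape
)

mod <- brm(bf(y ~ x1 + (1|Group),
              shape ~ (1|Group)),
           family = negbinomial(), data = df, prior = priors)

get_prior() on the same formula, data, and family will list all the parameter classes you can set priors on.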

Thanks Jonah. It sounds like it might have something to do with the default priors giving too much weight to values that generate extreme dispersion in the negative binomial distribution, as described halfway down this page: https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations

I know very little about the process for selecting appropriate priors, and my limited stats background makes it hard for me to understand a lot of the supporting text. But if I find a solution and/or a prior that enables the above reproducible example to converge, I’ll post it here.