Divergent transitions with multilevel model with 3 groups

Operating System: Windows 10
Interface Version: R version 3.4.4 / rstanarm 2.17.3 (also tried brms)

I’m trying to fit a basic multilevel model where I have data from three groups (about 35 per group) and the outcome variable consists of values falling between 0 and 1.0:

stan_glmer(outcome ~ (1|group), data=data, control=list(adapt_delta=.999), family = mgcv::betar)

While I expected this simple model to work, I get a warning message that there were divergent transitions. I then used the pairs() plot to try to diagnose things:


I can see that most of the divergences occur within a certain area, but I’m still unsure how to address them. Thus far I’ve tried (a) increasing adapt_delta to .99 (and even higher), (b) fitting the model in brms, which I’ve read does non-centered parameterization, (c) using a bunch of different priors [e.g., ranging from N(0,.1) to N(0,10)] for the group variance parameter, including very informative ones as I only have 3 groups, and (d) adjusting the max tree-depth.

Are there any other avenues I should pursue?

Hmm, hard to say. Are you able to share your data? Or if not, some simulated data similar enough that it also demonstrates the problem? That would definitely help us diagnose the problem.
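If the real data can't be shared, a simulated stand-in along these lines might be enough to reproduce the setup. This is only a sketch: the group means, the precision `phi`, the seed, and the name `sim_data` are made-up assumptions, not values estimated from the original data.

```r
# Simulate a data set shaped like the one described in the question:
# 3 groups, 35 observations each, outcome strictly between 0 and 1.
# All numeric values below are illustrative assumptions.
set.seed(123)
n_groups <- 3
n_per_group <- 35
group_means <- c(0.45, 0.50, 0.55)  # assumed group-level means
phi <- 10                           # assumed beta precision

group <- factor(rep(seq_len(n_groups), each = n_per_group))
mu <- group_means[as.integer(group)]

# Mean/precision parameterization of the beta distribution:
# shape1 = mu * phi, shape2 = (1 - mu) * phi
outcome <- rbeta(length(mu), shape1 = mu * phi, shape2 = (1 - mu) * phi)

sim_data <- data.frame(outcome = outcome, group = group)
```

Whether data simulated this way also triggers divergences would depend on how close together the assumed group means are.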

Oh, and rstanarm also does the non-centered parameterization, so the results you’re getting are from the non-centered parameterization. It’s conceivable that the centered parameterization is better for this data (if the data are very informative about the parameters), but it’s hard to say without having the data or similar enough data.

And one minor coding thing: unlike with rstan::stan, with stan_glmer we save you the trouble of having to put adapt_delta in the control list. You can specify it like any other argument. It should work either way though.
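As a rough sketch of that second form, adapt_delta can be passed directly as an argument. The wrapper name `fit_beta_intercepts` is hypothetical, and `data` is assumed to have the columns `outcome` and `group` used in the original call; the rstanarm and mgcv packages need to be installed for the function to run.

```r
# Sketch: pass adapt_delta directly to stan_glmer() instead of via `control`.
# `fit_beta_intercepts` is a hypothetical helper name. `data` is assumed to
# contain `outcome` (values in (0, 1)) and `group`.
fit_beta_intercepts <- function(data, adapt_delta = 0.999) {
  rstanarm::stan_glmer(
    outcome ~ (1 | group),
    data = data,
    family = mgcv::betar,
    adapt_delta = adapt_delta  # same effect as control = list(adapt_delta = ...)
  )
}
```

Calling `fit_beta_intercepts(data)` should then be equivalent to the original call with the control list.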

@jonah, sure, here’s some data that’s similar and still demonstrates the problem: data.csv.

Also, thanks for letting me know that rstanarm uses non-centered parameterization. I thought it might but I wasn’t quite sure. Also cool to know I don’t have to put adapt_delta in the control list. Thanks!

Thanks for sharing the data.

One issue may be that the standard deviation of the group-level parameters gets too close to zero occasionally. You could try specifying a prior for that parameter that pulls it away from zero a bit. I would try adding prior_covariance = decov(shape = 5) to your call to stan_glmer (doesn’t have to be 5, but try increasing above 1). This is equivalent to specifying a gamma prior with a shape parameter of 5 on that standard deviation. The default is a shape parameter of 1. Here’s a comparison of the default prior (blue) and the prior with shape parameter equal to 5 (orange):


You can see that after increasing the shape parameter we get a distribution that puts much less probability mass near zero, which will hopefully help avoid the issue.
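To put a rough number on "much less probability mass near zero", the two gamma priors can be compared in base R. This is a sketch: it assumes a scale of 1 for both priors, and the cutoff of 0.1 for "near zero" is an arbitrary illustrative choice.

```r
# Compare how much prior mass sits near zero under gamma(shape = 1)
# versus gamma(shape = 5). Scale = 1 for both is an assumption here,
# and `cutoff` is an arbitrary threshold for "near zero".
cutoff <- 0.1
mass_shape1 <- pgamma(cutoff, shape = 1, scale = 1)
mass_shape5 <- pgamma(cutoff, shape = 5, scale = 1)

# Prior densities on a grid, e.g. for plotting the two curves
sd_grid <- seq(0, 15, length.out = 500)
dens_shape1 <- dgamma(sd_grid, shape = 1, scale = 1)
dens_shape5 <- dgamma(sd_grid, shape = 5, scale = 1)

print(c(shape1 = mass_shape1, shape5 = mass_shape5))
```

Plotting `dens_shape1` and `dens_shape5` against `sd_grid` gives a picture similar in spirit to the comparison above. Note that a larger shape also moves the bulk of the prior toward larger standard deviations, not just away from zero.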

When I try this approach using the data you sent I am able to get rid of the divergences.
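A sketch of what that refit might look like, along with one way to count divergences afterwards. The helper names `fit_with_shape` and `count_divergences` are hypothetical, `data` is assumed to have the columns `outcome` and `group`, and the adapt_delta value simply mirrors the original call. The functions need rstanarm, mgcv, and bayesplot installed to run.

```r
# Sketch: refit with a gamma prior of the given shape on the group-level
# standard deviation via decov(). Helper names are hypothetical.
fit_with_shape <- function(data, shape = 5) {
  rstanarm::stan_glmer(
    outcome ~ (1 | group),
    data = data,
    family = mgcv::betar,
    prior_covariance = rstanarm::decov(shape = shape),
    adapt_delta = 0.999
  )
}

# Count post-warmup divergent transitions across all chains.
count_divergences <- function(fit) {
  np <- bayesplot::nuts_params(fit)
  sum(np$Value[np$Parameter == "divergent__"])
}
```

For example, `count_divergences(fit_with_shape(data, shape = 5))` could be compared against the same call with `shape = 1` or `shape = 2` to see how the prior affects the number of divergences.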

Thanks for taking a look for me. I too find that the gamma prior with shape 5 gets rid of the divergent transitions. However, summarizing the model makes me a little unsure whether this is helpful. Basically, the group variance parameter blows up:

Here is a plot of the posterior densities, which shows the same thing:

Changing the shape parameter to a smaller value helps a bit, but at values as low as 2 I still find divergent transitions and I still find some pretty extreme values:

In total, I’m wondering if this data is just not suited for a multilevel model? I was hoping to fit one to get some partial pooling, but maybe it’s not possible.