Divergent transitions after warmup using brms


I’m encountering difficulties concerning divergent transitions in brms (system information is pasted at the end of the post).

I have 2 data sets (dat_c.txt (13.9 KB) and dat_v.txt (14.3 KB)) that stem from the same experiment, in which 3 extremely-rare-to-find participants took part (subject_id) and 16 items were included (item_id). There are 3 independent variables (var_c, var_p, var_m), each with 2 levels coded as 1 and -1. The dependent variable (dv) is the duration of some kind of a phoneme, but it is not the same type of phoneme in both data sets, which is why I want to fit two separate models, one for each dependent variable.

Since I don’t have much data, my intention initially was to estimate only random intercepts for participants and items, besides the fixed effects, like here:

# random intercepts model for dat_c
fit_c_ranInt <- brm(dv ~ var_c * var_p * var_m + (1 | subject_id) + (1 | item_id),
                    data = dat_c, chains = 4, iter = 3000,
                    # the prior must be passed by name; unnamed, it would be matched to `family`
                    prior = set_prior("normal(0,50)", class = "Intercept"),
                    control = list(adapt_delta = 0.99, max_treedepth = 15, stepsize = .001))

# random intercepts model for dat_v
fit_v_ranInt <- brm(dv ~ var_c * var_p * var_m + (1 | subject_id) + (1 | item_id),
                    data = dat_v, chains = 4, iter = 3000,
                    prior = set_prior("normal(0,50)", class = "Intercept"),
                    control = list(adapt_delta = 0.99, max_treedepth = 15, stepsize = .001))

Now, when I run the fit_c_ranInt model, there are 4 divergent transitions. But when I fit a more complex model, with the most complex random effects structure allowed by the experimental design (fit_c_complex; pasted below), no divergent transitions occur and everything is fine.

By contrast, the random-intercepts model for the other dependent variable (fit_v_ranInt) produces 27 divergent transitions, whereas the more complex fit_v_complex model produces 6 divergent transitions.

# complex model for dat_c
fit_c_complex <- brm(dv ~ var_c * var_p * var_m + (var_c * var_p * var_m | subject_id) + (var_c * var_m | item_id),
                     data = dat_c, chains = 4, iter = 3000,
                     prior = set_prior("normal(0,50)", class = "Intercept"),
                     control = list(adapt_delta = 0.99, max_treedepth = 15, stepsize = .001))

# complex model for dat_v
fit_v_complex <- brm(dv ~ var_c * var_p * var_m + (var_c * var_p * var_m | subject_id) + (var_c * var_m | item_id),
                     data = dat_v, chains = 4, iter = 3000,
                     prior = set_prior("normal(0,50)", class = "Intercept"),
                     control = list(adapt_delta = 0.99, max_treedepth = 15, stepsize = .001))

My questions are:

  1. Why do I get more divergent transitions in the less complex model, and fewer or none in the more complex one, given that I have so little data?
  2. How can I solve the divergence problem for dat_v, for which I haven't managed to get a well-behaved model?
  3. Any idea why it should be harder to eliminate the divergent transitions for dat_v than for dat_c?


System information:

Without deeply understanding your model and data, here are my two cents:

  • One reason divergent transitions can occur is that the model is a very bad fit to the data.
    • Adding more flexibility can make the model a less wrong fit, but it likely does not remove the root cause.
  • By default, brms assumes a normally distributed response (the family argument defaults to gaussian). Your response is a duration, so it is always positive and cannot be exactly normal. A lognormal, exponential, or gamma family might work better, and is also likely a better representation of the data-generating process. I know little about phoneme duration, but I believe the available theory would tell you which of those distributions is most likely to match reality.
  • You can test whether your model and priors are sensible via prior predictive checks (see https://arxiv.org/abs/1709.01449 for more details).
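
To make the last two points concrete, here is a rough sketch of a prior predictive check with a lognormal family. The prior values are placeholders on the log scale, not recommendations, and I am assuming dat_v is loaded as a data frame with the columns from the post. With sample_prior = "only", every parameter needs a proper prior, which is why the population-level effects (class "b") get one here. Depending on your brms version, the pp_check argument may be nsamples rather than ndraws.

```r
library(brms)

# Placeholder priors on the log scale (assumptions, to be replaced with
# values that reflect plausible phoneme durations in your units).
priors_v <- c(
  set_prior("normal(4, 1)", class = "Intercept"),
  set_prior("normal(0, 0.5)", class = "b"),
  set_prior("normal(0, 0.5)", class = "sd")
)

# Only run the sampling part when the data from the post are available.
if (exists("dat_v")) {
  fit_v_prior <- brm(
    dv ~ var_c * var_p * var_m + (1 | subject_id) + (1 | item_id),
    data = dat_v,
    family = lognormal(),
    prior = priors_v,
    sample_prior = "only",  # ignore the likelihood, draw from the priors alone
    chains = 4, iter = 2000
  )

  # Compare datasets simulated from the priors with the observed durations.
  pp_check(fit_v_prior, ndraws = 100)
}
```

If the simulated durations are wildly off from anything a phoneme could plausibly last, the priors (or the family) are worth revisiting before worrying about sampler settings.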
Thanks for the input!
These already sound like good starting points. I’ll give them a try.

Just a brief follow-up on this: the exponential family turns out to be reasonable enough, and the models now run without problems.
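
For reference, a refit with the exponential family could look roughly like the sketch below. This is an assumption about the shape of the call, not the exact model that was run: the object names fam_exp and fit_v_exp are made up for illustration, the priors are left at the brms defaults, and dat_v is assumed to be loaded as in the original post. Note that the exponential family uses a log link by default, so a prior such as normal(0,50) on the intercept would be extremely wide on that scale.

```r
library(brms)

# brms family object; the default link for exponential() is "log".
fam_exp <- exponential()

# Only fit when the data from the post are available.
if (exists("dat_v")) {
  fit_v_exp <- brm(
    dv ~ var_c * var_p * var_m + (1 | subject_id) + (1 | item_id),
    data = dat_v,
    family = fam_exp,
    chains = 4, iter = 3000,
    control = list(adapt_delta = 0.99)
  )

  # Count post-warmup divergent transitions across all chains.
  np <- nuts_params(fit_v_exp)
  n_divergent <- sum(np$Value[np$Parameter == "divergent__"])
  print(n_divergent)
}
```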

Thanks again for your help!