Help eliminating divergent transitions for gamma mixture model


#1

Background: We’ve been building up a gamma mixture model with censoring to help understand some interval data. Initially, our fit had serious problems (multiple modes, lots of divergences), but a combination of changes to the model and better priors seems to have fixed most of them. Unfortunately, we’re still getting a divergent transition about every 5000 steps or so. Also, sometimes one chain lags behind the other ones, taking several times longer to run (we think that this is related to the same bad geometry of the model that’s causing divergences, however). We have not finished adding parameters to this model but would like to fix the divergent transition before continuing further.

Attempts to fix issue: In addition to tweaking our priors, we’ve tried raising adapt_delta as high as 0.99, and made sure that both gamma distributions had a log link, as suggested in response to an earlier question. We’ve also tried looking at the divergences in various pairs-plots and in shinystan, but didn’t see any clear funnels, so we’re not really sure which part of the model is at fault.

I’ve included a copy of our model code below, along with a data frame. Do you see anything we could do to eliminate the bad geometry that’s causing the last few divergences (and the occasional super-slow chain)? Thanks in advance for any help you can provide!

Relevant data frame:
brms-data.csv (30.2 KB)

intervals <- read_csv(‘brms-data.csv’)
set.seed(5) # With this seed, we get one divergence
chains = 4

prior1 <- c(
  set_prior("gamma(50, 10)", class="shape1"),
  set_prior("gamma(100, 50)", class="shape2"),
  set_prior("student_t(10, 2, .5)", class = "Intercept", dpar = "mu1"),
  set_prior("student_t(10, 5, 0.5)", class = "Intercept", dpar = "mu2"),
  set_prior("student_t(10, -1, 1)", class = "Intercept", dpar = "theta2"),
  set_prior("student_t(10, 0, .5)", class = "b", dpar = "mu1"),
  set_prior("student_t(10, 0, .5)", class = "b", dpar = "mu2"),
  set_prior("student_t(10, 0, 1)", class = "b", dpar = "theta2"),
  set_prior("lkj(2)", class = "cor"),
  set_prior("lognormal(-2, .25)", class = "sd", dpar = "mu1"),
  set_prior("lognormal(-2, .25)", class = "sd", dpar = "mu2"),
  set_prior("lognormal(-2, .25)", class = "sd", dpar = "theta2")
)

fit1 <- brm(
  brmsformula(
    intervals | cens(censored) ~ 1 + phase + (1|male|male_id) + (1|trial),
    theta2 ~ 1 + phase + (1|male|male_id) + (1|trial)
  ),
  data = x, 


  family = mixture(
    Gamma("log"), 
    Gamma("log"), 
    # Enforces that mixture elements (short vs long)
    # have the same interpretation across chains
    order = TRUE
  ),
  prior = prior1,
  control = list(adapt_delta = 0.95),
  chains = chains,
  cores = chains, 
  inits = "random", 
  sample_prior = FALSE,
  init_r = 0.25,
  sample_file = "chains",
  iter = 2000
)

#2

Sorry for responding so slowly.

First of all, I think you already did a quite good job at getting rid of problems in your model.

You could try using another family for instance the “weibull” family, which I have observed
to be often a little bit more stable than gamma, while being able to fit similiarly shaped data.
For the weibull family, you may also try to set inits = 0, if initialization turn out to be hard.