Divergences with no pattern, master thesis

Hello, I am quite new to Stan. For my master’s thesis I had been using BUGS, but I had to switch because of the complexity of the models. I’ve spent quite a bit of time understanding how Stan works behind the scenes, and I’ve seen that it’s important to take divergences into account.

I am using a “new” probability distribution. Out of 1500 iterations, 944 divergences come out. Raising adapt_delta to 0.99 brought them down to 780. I’ve looked at the bayesplot tutorials and others, but none of them help. I have tried plotting the divergent versus non-divergent draws of the random effects, but the only pattern is that the divergent ones are more concentrated in the scatterplot. I haven’t found any other pattern, and I don’t know whether I can consider the model valid, or whether I should add some justification to my thesis. The only pattern I found is in the acceptance rate, but it looks like a thin line. Thank you all in advance.
divergence.pdf (24.1 KB)

What is your “new” probability distribution?

It’s the Skellam distribution, so it is not really “new”, but I wanted to refer to it as a user-defined one. The divergences arise when I add random effects to account for the longitudinal data.
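For reference, a generic sketch of a user-defined Skellam log-pmf in Stan, written via the modified Bessel function (the names skellam_lpmf, mu1 and mu2 are placeholders; my actual implementation may differ in the details):

functions {
  // Skellam log-pmf: k is the difference of two Poisson counts
  // with rates mu1 > 0 and mu2 > 0.
  real skellam_lpmf(int k, real mu1, real mu2) {
    return -(mu1 + mu2)
           + 0.5 * k * (log(mu1) - log(mu2))
           + log_modified_bessel_first_kind(abs(k), 2 * sqrt(mu1 * mu2));
  }
}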

Have you tried fitting your model to simulated data (from fixed parameters)? Do the divergences still occur then or only when applying it to the real data you collected?

Yes, I tried the same model on the dataset after removing the repeated observations and, hence, the random intercept as well; that works fine. Manually simulating a dataset like the one I am using, with random effects, and fitting the Bayesian model to it also raises a divergence warning, but this time only about 40. All the regression parameters were recovered closely except for 2.

Is there any way to look for the root cause? Can I ignore divergences?

I wonder if there are some issues with your implementation of random effects. Have you looked into possible re-parameterizations?

It is the typical non-centered way: adding a standardized intercept with a std_normal() prior to the linear regression, multiplying it by sigma and adding beta[1].
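For concreteness, a minimal sketch of that parameterization (the names J, beta1, sigma_u and u_raw are placeholders, not my exact code):

data {
  int<lower=1> J;                        // number of subjects
}
parameters {
  real beta1;                            // fixed intercept (beta[1] in my model)
  real<lower=0> sigma_u;                 // random-intercept standard deviation
  vector[J] u_raw;                       // standardized random intercepts
}
transformed parameters {
  vector[J] u = beta1 + sigma_u * u_raw; // non-centered subject intercepts
}
model {
  u_raw ~ std_normal();
  sigma_u ~ normal(0, 1);                // half-normal because of the lower bound
  // ...rest of the linear predictor and the Skellam likelihood go here
}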

I attach the pairs plot.

Does the new distribution impose constraints on its parameters? For example, the following model produces many divergences:

transformed data {
  int n = 1;
}
parameters {
  real lambda;           // unconstrained: the sampler can propose lambda <= 0
}
model {
  n ~ poisson(lambda);   // poisson() is undefined for lambda <= 0
}

The Poisson distribution requires lambda > 0, but the parameter is declared as an unconstrained real. The sampler tries to explore negative values, and the resulting error is reported as a divergent transition.

The solution is to declare the parameter with a constraint so that the sampler knows to only explore positive numbers:

transformed data {
  int n = 1;
}
parameters {
  real<lower=0> lambda;  // constrained so the sampler only explores positive values
}
model {
  n ~ poisson(lambda);
}

Or you may be able to avoid the problem with an alternative parameterization. Here poisson_log is parameterized by the log of the rate, so it is valid for any real value of lambda:

transformed data {
  int n = 1;
}
parameters {
  real lambda;               // here lambda is the log rate, so any real value is valid
}
model {
  n ~ poisson_log(lambda);
}