Running the same brms model more than once gives widely different numbers of divergences. Why?

I have been running brms models in the zero-inflated Poisson family. Sometimes when I run the model I get high Rhat values (1.4) and many divergences. If I run the same model again, without making any changes, I can get a good fit: all Rhats < 1.01, good ESS numbers, and a small number of divergences. Note that I don’t specify a seed. What is happening? Does the first run influence the second one, even though compilation is repeated?

No. There is some stochasticity in the fitting. It sounds like either adaptation is fragile here, or there is a nontrivial probability of a chain encountering a minor mode and getting stuck there.
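If you want reruns to be reproducible (useful for debugging, though it won’t fix the underlying pathology), you can fix the seed in brm(). A minimal sketch; the formula, data, and variable names are illustrative:

```r
library(brms)

# Fixing the seed makes reruns reproducible: the same seed gives the
# same chains, and therefore the same divergences. Formula, data, and
# variable names here are illustrative placeholders.
fit <- brm(
  y ~ x + (1 | group),
  data   = d,
  family = zero_inflated_poisson(),
  seed   = 123
)
```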

> I can get a good fit: all Rhats < 1.01, good ESS numbers, and a small number of divergences

Be very cautious calling a fit with a “small number of divergences” a “good fit”. Based on the details you gave, it seems very possible that there is a problematic region of parameter space that the sampler struggles to explore. Any number of divergences can indicate the presence of such a region. I would venture to guess that in the “good fit” case the sampler manages to avoid this region more or less by chance, while in the “bad fit” case one or more chains get stuck in it, generating divergences and throwing off the Rhat values.

In both cases, you don’t necessarily know whether that problematic region matters for your inferences. I would explore the posterior fits more carefully to see if you can identify where the divergences are coming from; it may be that you need to impose stronger priors or reparameterize your model to get more reliable inferences.
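For example, here is a minimal sketch with bayesplot (the parameter names are illustrative; pick the ones you suspect from your own model):

```r
library(bayesplot)

# Extract NUTS sampler diagnostics (including divergence indicators)
# from the brmsfit object and overlay them on a pairs plot.
np <- nuts_params(fit)
mcmc_pairs(
  as.array(fit),
  np   = np,
  pars = c("b_x", "sd_group__Intercept")  # illustrative parameter names
)
```

Divergent transitions show up as highlighted points; if they cluster somewhere (e.g., near a group-level SD of zero, the neck of a funnel), that is your problematic region.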


Yes, you are right. The pairs plots of some of the problematic models show a concentration of divergences in certain regions. This can occur even when all Rhats < 1.01.

By stronger priors, do you mean restricting sampling so as to exclude the problematic region?

The problem here is that \widehat{R} can be fooled by chains mixing well through only part of the posterior. We don’t know how to diagnose that with sampling alone.

The only way we know to systematically diagnose these problems is simulation-based calibration (SBC): simulate parameters from the prior, simulate data given those parameters, fit the model, and check calibration of the posterior against the simulated truth.
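A minimal sketch of the SBC loop for a deliberately simple Poisson model (the formula, priors, and sizes are illustrative; a real SBC run would use your actual model and more replicates):

```r
library(brms)

n_sims <- 100
n_obs  <- 50
ranks  <- numeric(n_sims)

# Compile the model once on placeholder data; update() below reuses
# the compiled model, so each replicate only re-runs the sampler.
d0  <- data.frame(y = rpois(n_obs, 1), x = rnorm(n_obs))
fit <- brm(y ~ 0 + x, data = d0, family = poisson(),
           prior = prior(normal(0, 1), class = "b"),
           chains = 1, refresh = 0)

for (i in seq_len(n_sims)) {
  b_true <- rnorm(1, 0, 1)                 # 1. simulate parameter from prior
  x      <- rnorm(n_obs)
  y      <- rpois(n_obs, exp(b_true * x))  # 2. simulate data given parameter
  fit_i  <- update(fit, newdata = data.frame(y = y, x = x), refresh = 0)
  draws  <- as.data.frame(fit_i)$b_x       # 3. fit
  ranks[i] <- sum(draws < b_true)          # 4. rank of truth among draws
}

# If the sampler is calibrated, the ranks should be roughly uniform;
# systematic deviations flag regions the sampler fails to explore.
hist(ranks)
```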

That’s one approach if you really want to exclude a region, though I’d recommend doing it in a “soft” way, with informative priors, rather than with a hard constraint. The other approach is to reparameterize. The root cause is almost always too-high curvature, which can often be removed by reparameterization, for example with a non-centered parameterization for hierarchical models.
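In brms terms, a “soft” exclusion is just a tighter, still-proper prior, and you can also raise adapt_delta so the sampler takes smaller steps through high-curvature regions. A sketch; all names and the specific prior choices are illustrative:

```r
library(brms)

# Tighter (but still proper) priors gently pull the sampler away from
# the problematic region without imposing a hard boundary.
priors <- c(
  prior(normal(0, 1), class = "b"),     # regression coefficients
  prior(exponential(2), class = "sd"),  # group-level SDs
  prior(beta(2, 2), class = "zi")       # zero-inflation probability
)

fit <- brm(
  y ~ x + (1 | group),
  data    = d,
  family  = zero_inflated_poisson(),
  prior   = priors,
  control = list(adapt_delta = 0.99)    # smaller steps in curved regions
)
```

Note that brms already non-centers its group-level effects by default, so if the curvature comes from somewhere else you may need different priors or a custom Stan model.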

Thanks everyone. I think the situation is as clear as it can be.