Divergent transitions, or convergence, depending on random seed

Aaron_Mackey · July 7, 2020, 5:43pm

I have a model that, depending on the choice of seed, sometimes converges cleanly with no diagnostic warnings and at other times fails to converge, with many divergent transitions, even after increasing adapt_delta to 0.99. Paradoxically (at least to me), a seed that “works” at an adapt_delta of 0.99 then fails to converge when I increase adapt_delta to 0.999 without changing the seed. I am using rstan_2.19.3, if it makes any difference. Any general advice on how to better manage such instability? Thanks, -Aaron

andrjohns · July 8, 2020, 6:01am

Hi Aaron,

This indicates that there are parts of the posterior that are difficult for the sampler (either due to tricky geometry or other model issues). The choice of seed determines whether or not the sampler ends up in those difficult areas and runs into issues. Similarly, increasing adapt_delta makes the sampler take smaller steps, so it explores the posterior more thoroughly and can arrive in those difficult areas.

If you’ve got very wide priors (e.g. normal(0, 1000)), try reducing those to something more reasonable. Also, have a look into reparameterising your model; there is more information in the Stan manual here.
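
As a rough illustration of both suggestions, here is a minimal sketch of a non-centred hierarchical model with tighter priors. The original post does not show the model, so everything here is an assumption: the hierarchical structure, the names (J, y, sigma, mu, tau, theta_raw), and the prior scales are all hypothetical placeholders.

```stan
// Hypothetical hierarchical model; names and prior scales are
// illustrative assumptions, not taken from the original post.
data {
  int<lower=1> J;              // number of groups
  vector[J] y;                 // observed group-level estimates
  vector<lower=0>[J] sigma;    // known standard errors
}
parameters {
  real mu;                     // population mean
  real<lower=0> tau;           // population scale
  vector[J] theta_raw;         // standardised group effects
}
transformed parameters {
  // Non-centred form: theta is built from a standard-normal variable,
  // which tends to reduce the funnel-shaped dependence between
  // theta and tau that often produces divergent transitions.
  vector[J] theta = mu + tau * theta_raw;
}
model {
  mu ~ normal(0, 5);           // tighter than something like normal(0, 1000)
  tau ~ normal(0, 5);          // half-normal, because tau is constrained positive
  theta_raw ~ normal(0, 1);    // implies theta ~ normal(mu, tau)
  y ~ normal(theta, sigma);
}
```

Whether the non-centred form helps depends on how informative the data are about each group; with very informative data the centred form can sample better, so it is worth comparing the diagnostics of both.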

I wouldn’t recommend just using the results from the seed that doesn’t give divergent transitions, since the divergences are indicating a problem with your model that should be addressed.

Also, I’ve no doubt mangled the NUTS/sampling explanation a little here, so I’d be interested in having @betanalpha correct anything I’ve gotten wrong.

Aaron_Mackey · July 8, 2020, 8:58pm

Andrew, thanks for your clear explanation. Indeed, simply adding priors to shy away from insane values stabilized everything quite nicely (and of course it all runs much faster now too). When I don’t bother to put priors on parameters in the model block, does that equate to normal(0, 1000), or something else entirely? -Aaron

betanalpha · July 9, 2020, 2:01am

The implicit prior density is constant, otherwise known as a “uniform prior”, although that term can be somewhat ill-defined. For more detail see the first three paragraphs of https://betanalpha.github.io/assets/case_studies/stan_intro.html#436_model_block.
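
To make the point concrete, here is a small sketch of a model with no prior statements. The model and the names (N, y, mu, sigma) are hypothetical, and the commented-out prior scales are placeholders, not recommendations.

```stan
// Hypothetical example; the model and names are assumptions.
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;                 // no prior below: constant density over the real line
  real<lower=0> sigma;     // no prior below: constant density over sigma > 0
}
model {
  // With only the likelihood stated, the implicit prior on mu and
  // sigma is constant over their declared ranges, not normal(0, 1000).
  y ~ normal(mu, sigma);

  // Uncommenting these lines would replace the implicit constant
  // priors with proper ones:
  // mu ~ normal(0, 10);
  // sigma ~ normal(0, 5);
}
```

Because a constant density over an unbounded range does not integrate to one, the posterior is only well behaved if the likelihood alone constrains the parameters sufficiently.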
