Blog: Taming Divergences in Stan Models

martinmodrak · February 19, 2018, 7:01am

So I am a bit nervous about this one. I tried to write a post I would have liked to read early in my love affair with Stan. I am compiling the strategies I have used to handle divergences and my understanding of what divergences are. But I am no expert yet and my calculus is rusty, so I am a bit out of my depth and I hope I didn’t write anything stupid. The post is aimed at people whose calculus is also rusty or who have had little exposure to calculus at all. All corrections welcome.

http://www.martinmodrak.cz/2018/02/19/taming-divergences-in-stan-models/

betanalpha · February 19, 2018, 8:16am

Divergences are not discussed in the original NUTS paper. You’ll likely want to reference instead https://arxiv.org/abs/1701.02434 which has an extensive discussion of divergences and how they relate to the stability of numerical integrators and the geometry of the target distribution.

martinmodrak · February 19, 2018, 8:34am

Good point, reference added, thanks.

stijn · February 19, 2018, 8:53am

This is an awesome overview. I wish it had been around when I started struggling with divergences. One more piece of advice (but that would ruin the nice list of 10) would be to check for coding errors. From memory:

Leaving an unused unbounded parameter lingering around without a prior can lead to divergences.
Exiting a loop too early: writing (for n in 1:K) instead of (for n in 1:N) with K < N leads to the same issue as above.
Stupid coding and algebra errors, like a multiplication instead of an addition, a square root instead of a square, exp instead of log, or a missing minus sign, can lead to numerical problems (overflow or NaN) and introduce identification problems.

I have spent some time looking for a statistical problem which was actually a coding/algebra problem.
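
The first two items in the list above can be illustrated with a minimal sketch. This is plain Python rather than Stan, and the data values and function names (`normal_lpdf`, `log_density`) are hypothetical, chosen only for illustration. It assumes flat priors on every parameter, which is what a Stan parameter effectively gets when no prior is written for it. With the loop stopping at K instead of N, the log density does not change at all when the untouched parameters move, so nothing keeps them from drifting arbitrarily far.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)


def log_density(theta, y, n_used):
    """Unnormalized log posterior with flat (improper) priors on theta.

    Only the first n_used observations enter the likelihood, mimicking
    a loop written as `for n in 1:K` when `for n in 1:N` was intended.
    """
    return sum(normal_lpdf(y[n], theta[n], 1.0) for n in range(n_used))


y = [0.5, -1.2, 0.3, 2.0]        # hypothetical data, N = 4
N, K = len(y), 2                 # K < N: the loop stops too early

theta_a = [0.0, 0.0, 0.0, 0.0]
theta_b = [0.0, 0.0, 1e6, -1e6]  # move only the parameters the loop skips

buggy_a = log_density(theta_a, y, K)
buggy_b = log_density(theta_b, y, K)
fixed_a = log_density(theta_a, y, N)
fixed_b = log_density(theta_b, y, N)

# The buggy density is identical for both points: it ignores theta[2], theta[3].
print(buggy_a == buggy_b)
# The corrected density penalizes the extreme values.
print(fixed_b < fixed_a)
```

A cheap check along these lines (perturb one parameter, see whether the log density reacts) may help spot parameters that the model never actually uses.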

martinmodrak · February 19, 2018, 9:38am

That has very much been my experience as well, but it didn’t occur to me to include it. Putting this as a proud #1 :-) - Thanks!

betanalpha · February 19, 2018, 10:03am

Correct me if I’m wrong, but you didn’t link to this earlier case study, https://betanalpha.github.io/assets/case_studies/divergences_and_bias.html, which discusses divergences in the context of hierarchical models and how to use their spatial distribution to investigate problems. Might also be a useful reference.

martinmodrak · February 19, 2018, 10:24am

I linked to it as an example for non-centered parametrization. I slightly changed the title of the link to make sure it advertises its content properly.

betanalpha · February 19, 2018, 10:38am

Ah. Might be worth referencing the link in the context of (5) as well as (7), as it demonstrates the use of pair plots to identify the source of divergences.

Guido_Biele · February 19, 2018, 11:53am

Nice to see lots of accessible information about divergent transitions collected in one place!
Given that you were asking for feedback: I would put the simulation of data as one of the first three points (not #9). For me, having to simulate data nearly always leads to a better understanding of “real” data and helps to formulate a better Stan model.
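
As a rough sketch of what simulating data can look like, the Python snippet below draws fake observations from a simple two-level normal model with known parameter values. The model, the function name `simulate_hierarchical`, and all parameter values are assumptions made for illustration; they are not taken from the blog post. Fitting a model to such data shows whether the known values can be recovered.

```python
import random


def simulate_hierarchical(n_groups, n_per_group, mu, tau, sigma, seed=0):
    """Simulate data from a two-level normal model (illustrative assumption).

    group_mean[g] ~ Normal(mu, tau)
    y[g][i]       ~ Normal(group_mean[g], sigma)
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    group_means = [rng.gauss(mu, tau) for _ in range(n_groups)]
    y = [[rng.gauss(m, sigma) for _ in range(n_per_group)]
         for m in group_means]
    return group_means, y


# Known "true" values that a fitted model should be able to recover.
true_means, fake_data = simulate_hierarchical(
    n_groups=8, n_per_group=20, mu=1.0, tau=0.5, sigma=2.0, seed=42
)
print(len(true_means), len(fake_data), len(fake_data[0]))
```

Keeping the seed as an argument makes it easy to regenerate the same dataset when a fit misbehaves, and to try several datasets to see whether the problem depends on the particular draw.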
