When I encounter divergences, they are often spread out all over the posterior distribution (or at least the part of it that has already been sampled). This makes it difficult to pin down their cause. Only in a minority of cases have I seen divergences that are tightly grouped in a certain area, giving a hint about what causes them.
I think these divergences are so often spread out for two reasons:
- The location where the divergence is shown is not where the energy error occurred, but the point from which the sample started moving before hitting the divergence-causing region.
- Because HMC is very good at traversing the posterior, the start of the path can be somewhere completely different than where the problem occurred.
Thus, I imagine that it would be highly useful if we could get more information about the divergent transition, such as the coordinates of the leapfrog step with the largest energy error, or just simply the exact path the sampler took.
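To make the idea concrete, here is a minimal sketch in plain Python of what such extra output could look like. It is not how Stan implements its sampler: the toy potential with a stiff "wall" beyond q = 2, the function names, and the energy-error cutoff of 1000 are all assumptions chosen for illustration. The integrator records every leapfrog position and the energy error relative to the start, and stops at the first step that exceeds the cutoff.

```python
def potential(q):
    """Toy 1-D potential (an assumption): standard normal plus a very stiff wall beyond q = 2."""
    wall = max(0.0, q - 2.0)
    return 0.5 * q * q + 0.5 * 10_000.0 * wall * wall


def grad_potential(q):
    """Gradient of the toy potential above."""
    return q + 10_000.0 * max(0.0, q - 2.0)


def hamiltonian(q, p):
    """Potential plus kinetic energy, with a unit mass."""
    return potential(q) + 0.5 * p * p


def leapfrog_path(q, p, step_size, n_steps, max_energy_error=1000.0):
    """Run leapfrog steps, recording each position and its energy error.

    Returns (path, energy_errors, divergent_step). divergent_step is the index
    of the first step whose energy error exceeds the cutoff, or None.
    """
    h0 = hamiltonian(q, p)
    path = [q]
    energy_errors = [0.0]
    divergent_step = None
    for i in range(1, n_steps + 1):
        p -= 0.5 * step_size * grad_potential(q)  # half step for momentum
        q += step_size * p                        # full step for position
        p -= 0.5 * step_size * grad_potential(q)  # second half step for momentum
        path.append(q)
        energy_errors.append(hamiltonian(q, p) - h0)
        if abs(energy_errors[-1]) > max_energy_error:
            divergent_step = i
            break
    return path, energy_errors, divergent_step


path, energy_errors, divergent_step = leapfrog_path(
    q=0.0, p=3.0, step_size=0.1, n_steps=30
)

if divergent_step is not None:
    print(f"trajectory started at q = {path[0]:.3f}")
    print(
        f"divergence detected at step {divergent_step}: "
        f"q = {path[divergent_step]:.3f}, "
        f"energy error = {energy_errors[divergent_step]:.1f}"
    )
```

In this toy case the trajectory starts at q = 0, well inside the harmless part of the potential, while the energy error blows up only once the path crosses into the stiff region. Having `path[divergent_step]` instead of just `path[0]` is exactly the kind of information I am asking for.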
In some cases there is a region that causes the sampler to make its stepsize very small, leading to long runtimes and/or treedepth warnings, but without causing divergences. In this case diagnosing divergences doesn’t help to find the problem. Therefore it would be helpful if we could also get info about the locations and energy errors of the individual leapfrog steps for those transitions which were not divergent, to check which parts of the posterior are most difficult for the sampler.
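As a rough sketch of how per-step information could be used when nothing diverges, the plain-Python example below logs the change in energy during each leapfrog step and averages it by position. Everything here is an assumption for illustration: the toy potential with a moderately stiff region beyond q = 2, the fixed step size, the hand-picked initial momenta, and the bin width. It is not Stan output.

```python
import math
from collections import defaultdict


def grad_potential(q):
    """Gradient of a toy potential (an assumption): standard normal plus a
    moderately stiff region beyond q = 2."""
    return q + 100.0 * max(0.0, q - 2.0)


def hamiltonian(q, p):
    """Toy potential plus kinetic energy, with a unit mass."""
    wall = max(0.0, q - 2.0)
    return 0.5 * q * q + 50.0 * wall * wall + 0.5 * p * p


def per_step_errors(q, p, step_size, n_steps):
    """Return a list of (position after step, |change in H during that step|)."""
    records = []
    h_prev = hamiltonian(q, p)
    for _ in range(n_steps):
        p -= 0.5 * step_size * grad_potential(q)  # half step for momentum
        q += step_size * p                        # full step for position
        p -= 0.5 * step_size * grad_potential(q)  # second half step for momentum
        h = hamiltonian(q, p)
        records.append((q, abs(h - h_prev)))
        h_prev = h
    return records


bin_width = 0.5
error_sums = defaultdict(float)
step_counts = defaultdict(int)

# Several trajectories from the same start with different initial momenta.
for p0 in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    for q, err in per_step_errors(q=0.0, p=p0, step_size=0.1, n_steps=60):
        left_edge = math.floor(q / bin_width) * bin_width
        error_sums[left_edge] += err
        step_counts[left_edge] += 1

mean_error = {edge: error_sums[edge] / step_counts[edge] for edge in error_sums}
worst_edge = max(mean_error, key=mean_error.get)

for edge in sorted(mean_error):
    print(f"q in [{edge:+.1f}, {edge + bin_width:+.1f}): mean |dH| = {mean_error[edge]:.5f}")
print(f"hardest region starts at q = {worst_edge:+.1f}")
```

None of these trajectories comes anywhere near a divergence, yet the bins around the stiff region should stand out clearly from the rest. A table or plot like this, built from the sampler's own leapfrog steps, would point directly at the part of the posterior that forces the small step size.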
I feel these things would greatly help me when debugging/optimizing my models.
Assuming that other people share my experience, especially newcomers to Stan, this could also significantly lower the entry barrier for new Stan users.
I think I once saw an example of an HMC implementation that could show the sampler path of divergent transitions. Unfortunately I couldn’t find the article about it. I vaguely remember it being about PyMC3.