Divergence transition visualization

Hi, I’m trying to use plots to diagnose divergent transitions, but I have thousands of parameters. Is there an efficient way to produce trace plots or other necessary plots? Currently I’m plotting 10 parameters at a time, and it will take forever to plot them all. Thanks!

Instead of plots, look at the summary stats: which parameters have bad R-hat values?
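
To make that check concrete, here is a minimal sketch in plain Python. It assumes you have the draws for each parameter as one list per chain; the names `rhat`, `flag_bad_rhat`, and `theta[...]` are made up for illustration. It computes the classic (non-split) Gelman-Rubin statistic, which is simpler than the R-hat that Stan's summary reports, so in practice you would just read the R-hat column from your fit summary and filter it the same way.

```python
import random
from statistics import mean, variance

def rhat(chains):
    """Classic Gelman-Rubin R-hat for one parameter.

    chains: list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)    # W: average within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5

def flag_bad_rhat(draws, threshold=1.1):
    """draws: dict mapping parameter name -> list of chains.

    Returns (name, rhat) pairs above the threshold, worst first.
    """
    scores = {name: rhat(chains) for name, chains in draws.items()}
    bad = [(name, r) for name, r in scores.items() if r > threshold]
    return sorted(bad, key=lambda item: item[1], reverse=True)

# Toy data (hypothetical): one parameter mixes well, one has a chain stuck elsewhere.
random.seed(1)
well_mixed = [[random.gauss(0, 1) for _ in range(500)] for _ in range(3)]
stuck = [[random.gauss(shift, 1) for _ in range(500)] for shift in (0, 0, 3)]

flagged = flag_bad_rhat({"theta[1]": well_mixed, "theta[2]": stuck})
for name, value in flagged:
    print(f"{name}: R-hat = {value:.2f}")
```

With thousands of parameters, a sorted list like this is usually a much faster starting point than paging through plots.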

Thank you! I just checked the summary stats and only three parameters have an R-hat larger than 1.1, but I have 1673 divergent transitions when running 4000 iterations for three chains. Is this usual?

No, that is not usual.

Have you looked at the Stan User’s Guide? The relevant chapters are: https://mc-stan.org/docs/2_24/stan-users-guide/problematic-posteriors-chapter.html and https://mc-stan.org/docs/2_24/stan-users-guide/change-of-variables-chapter.html

There are some useful visual diagnostics in the bayesplot package. I’ve found mcmc_parcoord() particularly helpful, especially when combined with coord_flip() so that you can easily see more parameters.

Thanks for the links! If only three parameters have an R-hat larger than 1.1, I shouldn’t have as many as 1673 divergent transitions, right?

Yes, I tried those, but with this many parameters each one ends up small and unclear in the plot, and it takes a long time to generate a plot with all the parameters.

Sorry, that’s wrong. Each divergence is one iteration of the sampler where the sampler was unable to use the Hamiltonian dynamics to jump to a new point in the posterior distribution. In short, the sampler got stuck over and over because the posterior geometry was too problematic for HMC.

Divergences are explained in the Stan Reference Manual, section 15.5 “Divergent transitions”, and also here. The latter says this:

Stan uses Hamiltonian Monte Carlo (HMC) to explore the target distribution, the posterior defined by a Stan program + data, by simulating the evolution of a Hamiltonian system. In order to approximate the exact solution of the Hamiltonian dynamics we need to choose a step size governing how far we move each time we evolve the system forward. That is, the step size controls the resolution of the sampler.

Unfortunately, for particularly hard problems there are features of the target distribution that are too small for this resolution. Consequently the sampler misses those features and returns biased estimates. Fortunately, this mismatch of scales manifests as divergences which provide a practical diagnostic.
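
To see the step-size point in action, here is a toy sketch in plain Python. This is not Stan's implementation, only an illustration: it runs a leapfrog integrator on a one-dimensional normal target with standard deviation `sigma` and tracks how far the Hamiltonian (the energy) drifts from its starting value. The function name, starting point, and the specific step sizes are assumptions chosen for the demo. For this target the leapfrog scheme is only stable when the step size is below `2 * sigma`; above that, the energy error grows without bound, which is the kind of blow-up that gets reported as a divergence.

```python
def leapfrog_energy_error(step_size, n_steps, sigma, q0=0.1, p0=1.0):
    """Simulate Hamiltonian dynamics for a N(0, sigma^2) target.

    Returns the largest absolute energy error |H - H0| seen along the trajectory.
    """
    def grad_potential(q):
        return q / sigma**2                  # gradient of -log density

    def hamiltonian(q, p):
        return 0.5 * q * q / sigma**2 + 0.5 * p * p

    q, p = q0, p0
    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(n_steps):
        p -= 0.5 * step_size * grad_potential(q)   # half step for momentum
        q += step_size * p                         # full step for position
        p -= 0.5 * step_size * grad_potential(q)   # half step for momentum
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return worst

sigma = 0.1                                        # a narrow feature of the target
fine = leapfrog_energy_error(step_size=0.05, n_steps=20, sigma=sigma)
coarse = leapfrog_energy_error(step_size=0.25, n_steps=20, sigma=sigma)
print(f"step 0.05: max energy error = {fine:.3g}")
print(f"step 0.25: max energy error = {coarse:.3g}")
```

The small step keeps the energy error tiny, while the large step is too coarse for a feature of width 0.1 and the trajectory explodes. In a real model the narrow region is often only a part of the posterior (a funnel neck, for example), so the same step size can be fine in one place and divergent in another.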

You should be able to use the pars and regex_pars arguments of mcmc_parcoord() to look at only a subset of the parameters. I’d recommend starting with the ones that have high R-hats. If things are still too compact to view, I’d recommend using it with coord_flip() and then saving the plot as a png file with a really large height. It may take a while to save, but you should then be able to scroll through it in an image viewer, web browser, or similar, and identify parameters that may be causing the problem.
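
If it helps to decide which names to plot first, the selection step can be sketched like this in plain Python. This is not bayesplot code; `pick_parameters`, the parameter names, and the R-hat values are all hypothetical. It keeps the parameters whose names match a regular expression and returns the ones with the highest R-hat first.

```python
import re

def pick_parameters(rhat_by_name, pattern, top_k=20):
    """Return up to top_k parameter names matching `pattern`, highest R-hat first.

    rhat_by_name: dict mapping parameter name -> R-hat value.
    """
    regex = re.compile(pattern)
    matches = [(name, r) for name, r in rhat_by_name.items() if regex.search(name)]
    matches.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in matches[:top_k]]

# Hypothetical summary values, for illustration only.
rhat_by_name = {"beta[1]": 1.01, "beta[2]": 1.30, "sigma": 1.15, "beta[3]": 1.12}
selected = pick_parameters(rhat_by_name, r"^beta\[", top_k=2)
print(selected)
```

A short list of names like this is the sort of thing you would then hand to the pars argument, so each plot only has to draw a few dozen parameters.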

Thank you for the detailed explanation! I read both links, but I think I need to read more to fully understand divergent transitions.

Thanks! I will try coord_flip()!