Lp distribution for divergent and non-divergent transitions

Hi all,

I’m fitting a hierarchical model for multiple subjects, and about 1% of the transitions are divergent (across 4 chains, each with 1000 warmup and 1000 post-warmup samples).

Scatter plots of the population-level parameters do not show any particular region of the posterior that is more prone to divergent transitions.

The energy plot from Betancourt 2017 (mcmc_nuts_energy in bayesplot) looks reasonable (perhaps the delta-E histogram is a little narrower than the marginal energy histogram):

Finally, the bulk of the distribution of log-posterior (lp__) values is similar among divergent and non-divergent transitions:

Since I have no experience with this violin plot, I wanted to ask for your advice. Does the similarity in the bulk of the two distributions suggest that the divergent transitions are not indicative of a failure to explore specific parts of the posterior? Or is the absence of the extreme tails in the upper right plot indicative of a problem?

I should note that I’m currently rerunning this model, as I noticed some unwanted autocorrelation for some of the parameters.

Many thanks,

Roey

Happy New Year,
Roey

Hierarchical models are particularly prone to divergences, as they can induce some tricky posterior geometry for the sampler (emphasis on can). Have a look at the Stan manual section on reparameterisation, which discusses this in more depth and also covers the use of the non-centered parameterisation for easier sampling (in some cases): 25.7 Reparameterization | Stan User’s Guide
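To illustrate the idea behind the non-centered parameterisation, here is a minimal Python sketch (not Stan code, and not taken from the guide). It only shows that drawing theta directly from Normal(mu, tau) and drawing a standard normal z and then setting theta = mu + tau * z give the same distribution. The function names and the example values of mu and tau are hypothetical and chosen purely for illustration.

```python
import random
import statistics


def centered_draws(mu, tau, n, rng):
    """Centered form: theta is drawn directly from Normal(mu, tau)."""
    return [rng.gauss(mu, tau) for _ in range(n)]


def non_centered_draws(mu, tau, n, rng):
    """Non-centered form: draw z ~ Normal(0, 1), then shift and scale.

    In a sampler, z would be the sampled quantity and theta a
    deterministic transform of z, mu and tau.
    """
    return [mu + tau * rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    mu, tau = 1.5, 0.3  # hypothetical population mean and scale
    c = centered_draws(mu, tau, 20000, rng)
    nc = non_centered_draws(mu, tau, 20000, rng)
    print("centered:     mean", statistics.fmean(c), "sd", statistics.stdev(c))
    print("non-centered: mean", statistics.fmean(nc), "sd", statistics.stdev(nc))
```

Both forms imply the same distribution for theta; the difference is which quantity the sampler works with. With the non-centered form, z has a fixed unit scale that does not depend on tau, which can make the geometry easier for the sampler in some cases.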

Thank you for your response, Andrew!
Indeed, we’ve reparametrized many of our parameters with non-centered parametrization. Since I could not identify any clear funnels with scatter plots of various parameters, I turned to other visualization tools. My question is specifically about the violin plot and its interpretation in light of the shape similarity between non-divergent and divergent transitions.
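One way to put numbers on the comparison shown in the violin plot is to compare quantiles of lp__ for divergent and non-divergent draws. The sketch below is only an illustration under assumptions: it assumes the lp__ values and the divergence flags have already been extracted from the fit as two plain lists of equal length, and the function names are hypothetical. With only about 1% divergent draws, the tail quantiles of the divergent group rest on very few points, so they should be read cautiously.

```python
import statistics


def split_by_divergence(lp, divergent):
    """Split lp values into (divergent, non_divergent) lists.

    `divergent` holds one truthy/falsy flag per draw.
    """
    div = [v for v, d in zip(lp, divergent) if d]
    non = [v for v, d in zip(lp, divergent) if not d]
    return div, non


def quantile_summary(values, probs=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Return {prob: quantile} using percentile cut points.

    statistics.quantiles with n=100 returns 99 cut points, so the
    p-th percentile sits at index p * 100 - 1. At least two values
    are needed.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {p: cuts[round(p * 100) - 1] for p in probs}


if __name__ == "__main__":
    # Toy values only, to show the call pattern; not real sampler output.
    lp = [-10.0, -12.0, -11.0, -15.0, -9.5, -13.0, -10.5, -14.0]
    divergent = [0, 1, 0, 1, 0, 0, 0, 1]
    div, non = split_by_divergence(lp, divergent)
    print("divergent:    ", quantile_summary(div))
    print("non-divergent:", quantile_summary(non))
```

If the two summaries agree in the middle quantiles but differ at 0.05 or 0.95, that corresponds to the situation described above, where the bulk matches but the tails do not.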