Sampling from the prior - why am I seeing divergent transitions?

betanalpha · December 10, 2021, 9:18pm

Conventions vary from field to field. The commonality is a prior density of the form \text{gamma}(y; \epsilon, \epsilon) or even \text{inv-gamma}(y; \epsilon, \epsilon) so that the density concentrates against y = 0 instead of suppressing it. The conventional value of \epsilon changes across time and disciplines.

Yes, while I had the Jacobian correct, in my haste to get an answer out I committed the cardinal sin of naive multiplication on the linear scale, dgamma(y, epsilon, epsilon) * exp(y), instead of canceling everything on the log scale to avoid floating point issues. Implementing the log density properly gives a result that agrees with yours.
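For anyone following along, here is a rough sketch of the difference between the two approaches. It assumes that y is the unconstrained variable y = log(x) with x ~ gamma(epsilon, epsilon) in the shape/rate parameterization, so the Jacobian factor is exp(y); the function names and the value epsilon = 1e-3 are mine, chosen only for illustration.

```python
import math

def naive_density(y, alpha, beta):
    """Density of y = log(x), x ~ gamma(alpha, beta), multiplied out on the linear scale."""
    x = math.exp(y)
    gamma_pdf = beta**alpha / math.gamma(alpha) * x**(alpha - 1.0) * math.exp(-beta * x)
    return gamma_pdf * x  # Jacobian |dx/dy| = exp(y)

def log_density(y, alpha, beta):
    """Same density on the log scale; (alpha - 1) * y + y cancels to alpha * y."""
    return alpha * math.log(beta) - math.lgamma(alpha) + alpha * y - beta * math.exp(y)

eps = 1e-3  # illustrative value, not taken from the thread
for y in (0.0, 20.0):
    print(y, naive_density(y, eps, eps), log_density(y, eps, eps))
```

Near the mode the two agree, but far enough out on the right the linear-scale product underflows to zero while the log-scale version stays finite.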

Indeed, the problem isn’t the location of the typical set relative to the initialization but rather the asymmetry in the log density function. If warmup spends too much time on the left then the adaptation will settle on too aggressive a step size, which can lead to occasional divergences when exploring the sharp drop-off on the right. Because the adaptation is stochastic, some fits can exhibit divergences while others do not.
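To put rough numbers on that asymmetry, here is a small sketch of the gradient of the log density. It again assumes y = log(x) with x ~ gamma(epsilon, epsilon), so that up to a constant the log density is alpha * y - beta * exp(y); the helper name and epsilon = 1e-3 are illustrative choices of mine.

```python
import math

def grad_log_density(y, alpha, beta):
    """d/dy of alpha * y - beta * exp(y), the y-dependent part of the log density."""
    return alpha - beta * math.exp(y)

eps = 1e-3  # illustrative value, not taken from the thread
for y in (-10.0, 0.0, 10.0):
    print(y, grad_log_density(y, eps, eps))
```

Under these assumptions the slope on the left never exceeds alpha, so the surface there is nearly flat and tolerates large steps, while on the right the slope grows exponentially in y. A step size tuned on the flat side will be too large for the steep side.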

Increasing adapt_delta forces a less aggressive step size adaptation, which should help, as does changing the prior model to soften that drop-off in the log density function.
