Adapt_delta

I ran a Stan program and got this warning message:

There were 23 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup

I went to the webpage and read the description, which was great. And then I re-ran my code setting adapt_delta:

fit <- stan("chickens.stan", data=data, control=list(adapt_delta=0.9))

And it worked fine.

Here’s my question. If our first recommendation is to increase adapt_delta, why not do this automatically? As a user, I’d find that convenient.


Increasing adapt_delta makes the sampler take smaller steps, so sampling also takes longer.

I’m still a beginner with Stan too, but as far as I understand, adapt_delta is the target average acceptance probability that the sampler aims for during warmup. The acceptance probability is tied to step size, that is, how far the sampler “jumps” on each draw. To raise the acceptance probability, the sampler has to decrease the step size and take smaller, more careful steps.

If you imagine the posterior (or the typical set) as a tall hill in the middle of a flat plain, then what a Monte Carlo sampler does is try to map out the shape of the hill by taking random steps around it and measuring the height at each step. It always takes a step if the elevation at the next location is higher; if the elevation there is lower, it takes the step with probability (next location elevation / current location elevation). If the sampler takes very big steps, it will often miss or “overshoot” the area of higher elevation, so its acceptance rate will be lower. If the sampler takes very small steps, its acceptance rate will be high, but it will take a long time to explore the hill.
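Here is a minimal sketch of the hill picture above in base R, assuming a one-dimensional standard normal as the "hill". It is a plain random-walk Metropolis sampler, not what Stan actually runs (Stan uses NUTS, a Hamiltonian Monte Carlo variant, and adapt_delta sets the acceptance target it tunes the step size toward during warmup). The function name metropolis_accept_rate and the step sizes are made up for illustration.

```r
# Random-walk Metropolis on a standard normal "hill".
# Illustration only: Stan uses NUTS, not this algorithm, and
# metropolis_accept_rate is a hypothetical helper name.
metropolis_accept_rate <- function(step_size, n_iter = 5000, seed = 1) {
  set.seed(seed)
  log_height <- function(x) dnorm(x, log = TRUE)  # log elevation of the hill
  x <- 0
  n_accept <- 0
  for (i in seq_len(n_iter)) {
    # Propose a random step whose typical length is step_size.
    proposal <- x + rnorm(1, mean = 0, sd = step_size)
    # Accept with probability min(1, height(proposal) / height(current)),
    # computed on the log scale.
    if (log(runif(1)) < log_height(proposal) - log_height(x)) {
      x <- proposal
      n_accept <- n_accept + 1
    }
  }
  n_accept / n_iter
}

# Acceptance rate for small, medium, and large steps.
rates <- sapply(c(0.1, 1, 10), metropolis_accept_rate)
print(rates)
```

The smallest step size should give the highest acceptance rate and the largest the lowest. That is the trade-off in the thread: small steps are rarely rejected but cover the hill slowly.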


That matches my understanding. It is a bit like a learning rate in deep learning.