Issue with dual averaging

There is value in setting it to some multiple, especially early in the adaptation during the first few mass matrix updates, since the updated mass matrix often admits a higher optimal step size, and starting high allows that step size to be reached more quickly.

It’s quite possible that having this as a tunable parameter would solve much of the problem. Part of the reason I end up with such low step sizes, which take forever to recover from, is that the step size is set too high at the start, and the momentum in the dual-averaging update then drives it far too low.
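To make the mechanism concrete, here is a minimal R sketch of the dual-averaging update from Hoffman and Gelman (2014), with Stan’s default \gamma = 0.05, t_0 = 10, \kappa = 0.75 and \mu = \log(10\,\epsilon_0); the acceptance statistics fed in are invented purely to illustrate how a burst of early rejections drags the averaged statistic down and how slowly the step size climbs back.

```r
# Sketch of the dual-averaging step size update (Hoffman & Gelman 2014,
# Algorithm 5), with Stan's default gamma, t0, kappa. The acceptance
# statistics fed in below are made up, purely for illustration.
dual_average <- function(alpha, eps0, delta = 0.8,
                         gamma = 0.05, t0 = 10, kappa = 0.75) {
  mu <- log(10 * eps0)  # attractor: 10x the initial step size
  H_bar <- 0
  log_eps_bar <- 0
  log_eps <- numeric(length(alpha))
  for (t in seq_along(alpha)) {
    eta <- 1 / (t + t0)
    H_bar <- (1 - eta) * H_bar + eta * (delta - alpha[t])
    log_eps[t] <- mu - sqrt(t) / gamma * H_bar
    w <- t^(-kappa)
    log_eps_bar <- w * log_eps[t] + (1 - w) * log_eps_bar
  }
  list(eps     = exp(log_eps),      # step size used at each iteration
       eps_bar = exp(log_eps_bar))  # averaged step size reported at the end
}

# A burst of early rejections (alpha near 0) followed by decent acceptances:
alpha <- c(rep(0.05, 15), rep(0.85, 185))
out <- dual_average(alpha, eps0 = 1)
plot(out$eps, type = "l", log = "y", xlab = "iteration", ylab = "step size")
```

The averaged statistic H_bar carries the memory of those early rejections, which is the “momentum” I mean above.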

However, the other part of the problem happens in the initialization phase.

There are certainly pathologies in the far tails of the model. There is an order of magnitude difference in the log-probability between the start of the warmup and where the model settles by the time sampling starts. There are plenty of divergences during warmup, but they are gone in sampling. My understanding, from comments on the forum and from the fact that Stan does not report divergences during warmup, is that divergences during warmup do not present a significant problem for model fitting. In looking into modifications to the warmup algorithm, I’m trying to see if there is a way to prevent these pathologies from causing the algorithm to take a really long time to reach the typical set.

To illustrate, here are some example plots of the sampling parameters.

This is a well-behaved model
It took over a week to run, so I haven’t tested it on the new warmup configuration. Note how long it takes for the stepsize to recover in chain 3 after it initially drops down, and the “sawtooth” pattern that keeps bringing it lower after the initial drop.

Here is an example of an especially problematic model
Notice how the stepsize for chain 3 keeps decreasing until the first 10x multiplier. It still requires many more leapfrog steps throughout warmup and sampling. Dots are divergent transitions. So the model still has some issues, but having the warmup take so long really gets in the way of running the model to look at the diagnostics.

Here is the same model with the symmetric step size adjustment
In this case, chain 4 drops down, but recovers much more quickly. Chain 3 still seems to be requiring more leapfrog steps than the others, but overall it runs much faster than when using the default step size adjustment algorithm.


The change I proposed is just changing the objective function of the optimization from minimizing E[||\delta-\alpha||] to E[||\min(\delta-\alpha, 1-\delta)||].

Whether this is better in general, I don’t yet know. However, it produced results that were very compelling on my model, and I thought it was worth sharing. I’ve made a lot of tweaks to the model to eliminate pathologies, and tried various combinations of changing the three algorithm parameters as well as scaling the warmup windows, and this has by far been the single greatest improvement to the model fitting.

Interestingly, Andrieu and Thoms 2008, referenced in the Hoffman paper, also suggest using H_t = \mathbb I_{\delta-\alpha_t > 0} - \mathbb I_{\delta-\alpha_t \leq 0}, which I guess would be equivalent to minimizing E[|\delta-\alpha|]. This would require setting \gamma to a higher value to keep the updates on the same scale. I’ll look into this a bit more. In checking how well the stepsize was working, I had been looking at the average acceptance stat during sampling. It was usually much higher than \delta, but since the step size optimizer is minimizing ||\delta-\alpha||, I guess this is to be expected whenever \delta > 0.5.
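To make the comparison concrete, here are the statistics under discussion written as drop-in replacements for the (\delta - \alpha_t) term in the dual-averaging update sketched above. This is my reading of the proposals, not code taken from Stan.

```r
# Candidate statistics H_t for the dual-averaging update, replacing the
# default (delta - alpha). Sketches of the ideas discussed here, not Stan code.

# Default: push the step size down in proportion to how far alpha falls short.
h_default <- function(alpha, delta) delta - alpha

# Proposed symmetric variant: cap the downward push at the size of the
# largest possible upward push (1 - delta).
h_symmetric <- function(alpha, delta) min(delta - alpha, 1 - delta)

# Andrieu & Thoms-style sign statistic; gamma would need rescaling to keep
# the updates on the same scale.
h_sign <- function(alpha, delta) (delta - alpha > 0) - (delta - alpha <= 0)
```

With \delta = 0.8, a divergent iteration (\alpha \approx 0) contributes 0.8 to the average under the default statistic but only 0.2 under the symmetric one, which is presumably why the crashes recover so much faster.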

Let me be clear: I am not advocating changing Stan based on this test on one model. I merely noticed some remarkable improvement in my model and thought it was worth sharing and looking into more. Even if it does turn out that another objective function is better across a wide variety of simple and pathological models, I would still advocate including it as another option rather than switching the default. The existing code has been in Stan so long, and so many models depend on it, that changing the defaults in such a core part of the code would certainly break someone’s models somewhere when they upgraded.

4 Likes

I completely agree, and I found the stepsize initialization quite puzzling in the beginning. If a standard warmup is chosen, the initial stepsize chosen by the user has basically no influence on the algorithm. Additionally, what I saw is that the initialization radius for the parameters is quite small and, depending on the model, it can be almost certain that the parameters are initialized outside the typical set. However, the stepsize adaptation already starts from the first iteration of the first adaptation window, which makes these sawtooth patterns very likely to occur.

I also saw that, but I haven’t looked into it in more detail yet. I frequently observed that the average acceptance probability during sampling was > 0.9 whereas \delta = 0.8.

The problem with something like that is that we don’t have 50 distinct test models that we know work, and comparing sampler output is tricky.

This would be great. It’s obviously easy to change, or even to move closer to user control.

RStan shouldn’t be noticeably slower than CmdStan. If it is in your setup, we’d like to know.

Others have suggested starting with smaller treedepths during initial warmup. We haven’t played around a lot with adaptation since @betanalpha rebuilt it.

I would personally very much like to have a more continuous adaptation model than one based on epochs. Maybe something with a rolling average for covariance, continuous or at least frequent stepsize adaptation, and gradual increase of tree depth.

We just got a PR to add configuration for \mu, so that should be out in 2.19.

I find myself typically fitting the same model or same flavor of model multiple times as I tweak bits of it. In that case, it can be a big win to figure out the adaptation parameters.

To add on to this old conversation, one issue I noticed with dual averaging is that the final average (approximate) acceptance rate is almost always above the target adapt_delta.

When I look at the adaptation parameters (via get_sampler_params) it looks to me like the final window size is too small. Per the above discussion, the algorithm starts high, then drops initially to counter that, eventually coming back up to the target. But the final window (adapt_term_buffer) is only 50 by default, and it looks like that is not long enough to stabilize. For instance, here’s a simple 3-parameter multivariate model with 10 chains, warmup=1900 and iter=2000, using default settings in RStan (so adapt_delta=0.8). The final step sizes range from 0.018 to 0.024, and the acceptance ratio ranges from 0.88 to 0.94 across the 10 chains, all clearly much higher than the target of 0.8.
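For reference, the kind of check I mean looks roughly like this (a sketch, with fit standing in for the stanfit object above):

```r
# Sketch of the check described above: compare each chain's post-warmup
# step size and mean acceptance statistic against the adapt_delta target.
# `fit` stands in for the stanfit object from the model above.
library(rstan)

sp <- get_sampler_params(fit, inc_warmup = FALSE)
step_size <- sapply(sp, function(x) mean(x[, "stepsize__"]))    # constant after warmup
accept    <- sapply(sp, function(x) mean(x[, "accept_stat__"]))
round(cbind(step_size, accept), 3)  # accept should be near adapt_delta (0.8 here)
```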

It seems to me that the step sizes are all too small because of that short final adaptation window, seen in this plot:

If I increase adapt_term_buffer to 1000, obviously extreme, then the final step sizes range from 0.027 to 0.035, and the acceptance ratio ranges from 0.82 to 0.86. So bigger step sizes, which shifts the acceptance ratio closer to the target by about 0.06.

I’m not proposing a terminal buffer size of 1000. I just noticed this behavior, which is unexpected from a user’s perspective (specify a target of 0.8 and get >0.9), and it appears closely related to this discussion topic and to how the step size changes during the early part of an adaptation window. It may be worth considering this aspect of the adaptation when considering alternatives, or at a minimum increasing the default for adapt_term_buffer.

5 Likes

Thanks @monnahc, this is a useful and clear report.

It seems that the final window could be smaller if the algorithm did not multiply the step size by 10 right before it. Multiplying by 10 is an ad hoc safety measure in case the mass matrix update is unexpectedly big. Ping @bbbales2, as this is useful information for campfire, too.

1 Like

I’m no expert, but it seems to me that the biggest issue here is how Stan recalculates a “reasonable” step size at the beginning of each window. I can’t find this in the code, but in the original Hoffman paper this targets 0.5 after a single step. It looks to me like that is what Stan does now, based on the plots I showed with large initial step sizes. Despite being “reasonable”, the initial few iterations end in divergences, presumably because at alpha = 0.5 for a single step, the true alpha for a whole trajectory is much lower. Then it drastically lowers epsilon because of the divergences, and then slowly recovers from that. If I replot the defaults (adapt_delta=0.8) and color by divergence this is apparent.

This is of course exacerbated if the user sets an adapt_delta=0.999. It looks like this:

Admittedly this is extreme, but the adapted epsilon goes from “reasonable” to something like 5 orders of magnitude smaller in the course of 10-15 iterations, before starting its slow recovery toward adapt_delta. To be fair, this model has high correlations (as high as 0.997), so it is extreme. But even if I use metric=dense_e the same behavior persists:

I get the original intent of finding a reasonable step size, and why 0.5 may be a good idea at the very beginning because you’re not in the typical set. But full trajectories will typically have a lower acceptance rate than a single step, and we see this pattern of quick drop and slow recovery by the nature of the adaptation algorithm. And that quick drop leads to computationally expensive trajectories and slow warmup, exactly what it’s trying to avoid. I would, naively, argue that they are unreasonably large step sizes.

I would consider exploring two alternative options: (1) set the target in find_reasonable_epsilon to be adapt_delta for all calls, or (2) after the initial call (first iteration), initialize from the last step size. I like (2) because, while that might not be “reasonable” the first time Stan adapts after updating the mass matrix, it probably is the rest of the time. Presumably for my toy example the epsilon adaptation would be more stable and overall Stan would be faster. Playing with the Stan source is a bit out of my comfort zone, but both of these seem fairly easy to try.
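For reference, here is the heuristic from the Hoffman and Gelman paper (Algorithm 4), sketched in R on a toy standard-normal target with the single-step acceptance threshold exposed as an argument, so option (1) amounts to passing adapt_delta instead of 0.5. This follows the paper, not Stan’s C++, which I couldn’t locate.

```r
# Sketch of the "find reasonable epsilon" heuristic from Hoffman & Gelman
# (2014, Algorithm 4) on a toy target. This follows the paper, not Stan's
# C++. target = 0.5 reproduces the paper; option (1) above would amount to
# calling it with target = adapt_delta instead.
find_reasonable_epsilon <- function(theta, log_p, grad_log_p, target = 0.5) {
  eps <- 1
  r <- rnorm(length(theta))  # momentum draw, unit mass matrix
  leapfrog <- function(theta, r, eps) {
    r <- r + 0.5 * eps * grad_log_p(theta)
    theta <- theta + eps * r
    r <- r + 0.5 * eps * grad_log_p(theta)
    list(theta = theta, r = r)
  }
  # Log acceptance ratio of a single leapfrog step of size eps.
  log_ratio <- function(eps) {
    prop <- leapfrog(theta, r, eps)
    (log_p(prop$theta) - 0.5 * sum(prop$r^2)) - (log_p(theta) - 0.5 * sum(r^2))
  }
  # Double or halve eps until the single-step acceptance crosses `target`.
  a <- if (log_ratio(eps) > log(target)) 1 else -1
  while (a * log_ratio(eps) > a * log(target)) {
    eps <- 2^a * eps
  }
  eps
}

# Toy example: 10-dimensional standard normal.
log_p      <- function(theta) -0.5 * sum(theta^2)
grad_log_p <- function(theta) -theta
find_reasonable_epsilon(rnorm(10), log_p, grad_log_p)
```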

Curious to see if anyone has experimented with this or has any thoughts.

5 Likes

Right now Stan just starts each interval with a timestep 10x the last timestep used in the previous interval. It’s hardcoded and not something you can change easily (https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/adapt_diag_e_nuts.hpp#L38). These plots you made make a really compelling case that we shouldn’t be doing this.
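As I read it, the restart amounts to roughly the following; this is a paraphrase in R of the behavior described in this thread, not the actual C++ behind that link.

```r
# Paraphrase of the window-restart behavior described in this thread (not
# the actual C++): after each metric update the dual-averaging state is
# reset and its attractor mu is pointed at `multiplier` times the last
# adapted step size. The hardcoded 10 is what would become tunable.
restart_stepsize_adaptation <- function(eps_last, multiplier = 10) {
  list(
    mu          = log(multiplier * eps_last),
    H_bar       = 0,
    log_eps_bar = 0,
    counter     = 0
  )
}
```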

I always have sorta just ignored it cause 15 samples didn’t seem like that much, but especially with that last 50 timesteps of adaptation that logic seems really shortsighted.

I know that the metric can change quite a bit from warmup window to warmup window, but I agree it seems more sensible to use the timestep from the last window as the guess in the new one.

I assume the 10x comes from issues where the timestep was crashing to zero and then not recovering without this 10x boost.

1 Like

Thanks for taking the time to make these plots. @monnahc do you think you’d have time to try changing that 10x multiplier around and testing it?

Which Stan are you using? If you’re doing this with cmdstan I can help you get set up with a version that lets you change the 10x to whatever you want.

1 Like

Also @monnahc, if you want some other models to test, we have a bunch in the cmdstan performance testing suite.

The dual averaging target is constant through the entire adaptation; the behavior @monnahc is noticing here is due to how the dual averaging is reinitialized at the end of every window to a large multiple of the previous adapted step size. The motivation is that after each window the metric is updated, which should lead to a better-conditioned system that admits a larger step size.

The reinitialization to 10 times the previous step size is meant to aggressively probe large values, which typically works well for well-behaved models. Even if the guess is too aggressive, the dual averaging settles into an accurate value relatively quickly. As Cole notes in his empirical studies, however, the dual averaging can be much slower for less well-behaved models.

The easiest solution is to increase warmup and the terminal buffer by the same amount when encountering more challenging models to give the dual averaging more time to equilibrate. Alternatively the dual averaging parameters themselves can be tweaked to try to compensate for the structure of the objective function \bar{a} - a_{\text{opt}}(\epsilon).
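For concreteness, in RStan all of these knobs are exposed through the control argument; the sketch below just lists them with Stan’s defaults (model and data_list are placeholders).

```r
# The adaptation knobs mentioned above, as exposed in RStan's control list.
# Values shown are Stan's defaults; `model` and `data_list` are placeholders.
# For a challenging model, increase warmup and adapt_term_buffer together.
fit <- sampling(
  model, data = data_list,
  iter = 2000, warmup = 1000,
  control = list(
    adapt_delta       = 0.8,   # target acceptance statistic
    adapt_gamma       = 0.05,  # dual-averaging regularization scale
    adapt_kappa       = 0.75,  # dual-averaging relaxation exponent
    adapt_t0          = 10,    # dual-averaging iteration offset
    adapt_init_buffer = 75,    # initial fast interval (step size only)
    adapt_window      = 25,    # first slow (metric) window, then doubling
    adapt_term_buffer = 50     # final fast interval (step size only)
  )
)
```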

Changing the defaults requires more careful consideration of the span of well-posed and less well-posed models in common use, which of those we want to prioritize with the defaults, and what the costs are of step sizes that can be underestimated.

2 Likes

@bbbales2 I’m not sure what you’re getting at with this:

I assume the 10x comes from issues where the timestep was crashing to zero and then not recovering without this 10x boost.

If it’s too small, isn’t the concern less that alpha won’t recover to adapt_delta (it does within a few iterations) and more that the trajectories will be too long in the meantime? I played around with it a bit outside of Stan, and it’s not a silver bullet to just lower \mu; I think it would take some serious tinkering. I’ve never compiled Stan and typically use rstan, but if these kinds of experiments, added to what was initially discussed in this thread, are of interest, I could try to do it this summer (paternity leave looming at the moment).

@betanalpha I follow you with respect to tweaking the other parameters. I think the easiest thing on the user end is to increase the terminal buffer to give it more time to equilibrate. I also get that changing core Stan functionality is not something done lightly. I mainly wanted to flag this issue for when/if a redesign of the adaptation is in play.

I would also push back a bit on the idea that this only happens for less well-behaved models. If you take an IID standard normal and run it through Stan, you get the same issue. Here’s what the mean alpha over the sampling iterations looks like for 10 chains, across 5 dimensions and 2 adapt_delta values.

This issue is pronounced for adapt_delta=0.8, where all chains have step sizes biased low. Interestingly, that pattern doesn’t hold for 0.95.

That is only to say that I think this issue is more general than most folks expect, and that may motivate a proper investigation of the optimal adaptation scheme across a wide range of models.
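For anyone who wants to reproduce the IID-normal check, a sketch along these lines is all it takes (5 iid normal parameters, 10 chains, adapt_delta of 0.8 and 0.95):

```r
# Sketch of the IID-normal check described above: 5 iid normal parameters,
# 10 chains, two adapt_delta values; the mean accept_stat__ per chain is
# then compared against the target.
library(rstan)

code <- "parameters { vector[5] x; } model { x ~ normal(0, 1); }"
mod <- stan_model(model_code = code)

mean_accept <- function(adapt_delta) {
  fit <- sampling(mod, chains = 10, iter = 2000, refresh = 0,
                  control = list(adapt_delta = adapt_delta))
  sapply(get_sampler_params(fit, inc_warmup = FALSE),
         function(x) mean(x[, "accept_stat__"]))
}

sapply(c(0.8, 0.95), mean_accept)  # one column per adapt_delta, one row per chain
```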

4 Likes

Well, I did kinda make it up (edit: and so maybe it wasn’t the most coherent thought). I was just trying to figure out a reason for the 10x, and not a 1x.

Just seems like we’d do a 1x, and the adaptation-happens-quickly-at-the-beginning argument would mean that we’d equilibrate up quickly anyway.

That kinda happens anyway right now in those plots you made, cuz it looks like the timestep starts high and then crashes waaaaay down and has to climb again. So if it just started low and climbed, it seems like it’d be the same thing, just starting closer to the answer.

Interesting plots.

Yeah if you want to do experiments I’m happy to make you a cmdstan with that factor of 10 as a parameter. It’s slightly more difficult but we could wrap it in a custom cmdstanr so you could use it from R as well. Just poke me if/when you get the time/desire to do more experiments.

2 Likes

I haven’t but you bring up an interesting point and have great plots to back it up. I completely agree we need to consider this in improving our adaptation algorithm.

Is there no way for it to go up otherwise?

Is there a way to make it something like an environment variable? Or is there another easy way to do this?

2 Likes

I know how to add cmdstan parameters, so that’s the easiest way for me :D. It’d just take 15 mins or something.

1 Like

Are you wanting to experiment? If so I’ll put up a cmdstan branch.

1 Like

One thing to keep in mind is that the dual averaging actually targets the logarithm of the step size, not the step size itself. By Jensen’s Inequality this implies that the procedure will, in general, yield (typically positively) biased estimates of the optimal step size. The behavior you see could just be the bias inherent to trying to adapt to the log step size.

To verify that the step size adaptation is being prematurely terminated one would have to show that the average acceptance statistic converges closer to 0.8 as the terminal window size is increased, ideally systematically over a wide range of varying conditions.

Another consideration is the practical consequence of the current setup. Does a longer terminal window, and hence a longer warmup, achieve better performance in the main sampling phase? The cost as a function of the target average acceptance statistic is relatively flat, so there often isn’t much cost to not achieving 0.8 exactly. Also keep in mind that the current adaptation criterion is suboptimal, but unfortunately the pull request implementing that fix was obstructed.

We might very well do better with a longer terminal window, but it’s not yet clear if that’s the case!

1 Like

Reviving this thread to share some potentially undesirable adaptation behavior that I’ve run into recently. I dug up this thread in order to link to it and realized that it’s all relevant.

Anyway, what I’ll show here is just from a single model, and I fully accept that it might be the case that (near-)optimal general-purpose adaptation behavior happens to fail for this particular model (i.e. one model can’t tell anybody if the adaptation algorithm is a good general-purpose algorithm or not). Still, I think this example reveals some behavior that’s worth mulling over.

This is a real-world model with about 250,000 parameters, fit to about 2 million data points. It takes days to fit (even with reduce_sum on an arbitrarily large number of cores), partly because it requires treedepths of 8 or so.

Here’s a plot of stepsize__ versus accept_stat__ for warmup iterations 500-1500 of a single chain run for 1500 warmup iterations (thus this is a snapshot of the reasonably well-adapted part of the windowed phase plus the term buffer):

Purple points resulted in treedepths of 9, orange points 8, and black points 7; red points diverged. The horizontal line is at acceptance probability 0.8, and the vertical line is at the adapted stepsize of 0.016, which was the result of this particular adaptation run. The line through the data is a smoothed fit (without much thought put into the smoothing parameters; I just want to show the location of the approximate mean). Here’s what I notice:

  • The adapted step-size is substantially smaller than the step-size that yields mean acceptance probabilities of 0.8.
  • In this model, the treedepth is very tightly controlled by the step-size, with extremely narrow regions of stochasticity, and likewise for divergences.

A couple of remarks:

  1. Maybe this is an edge case, but it suggests a class of models for which the claim upthread, that the cost is relatively flat as a function of the target acceptance statistic so there isn’t much cost to missing 0.8 exactly, isn’t true:

If we could get the step-size up to about 0.027 we could cut the number of gradient evaluations in half. Now in this particular example that cross-over point happens to be very near where the mean acceptance probability crosses 0.8, but imagine if the transition point between treedepths of 7 and 8 were just a touch further to the left…
Would it be way too kludgy to entertain an adaptation algorithm that checks whether the adapted step-size is near a transition point between treedepths and uses some heuristic to decide whether to cheat it over a bit? This would produce a 2x speedup for free in models that happen to land right near one of these transitions.

  2. Here’s a plot of the tree-depths by iteration, again with divergences in red:


    The term buffer does something a bit strange. I understand the need to aggressively probe large step-sizes as the metric is updated, but why does it aggressively probe such small step-sizes? Adaptation has known for nearly 1000 iterations that it doesn’t need to use a step-size so small as to yield a treedepth of 9, yet in the 50-iteration term buffer it goes there 18 times! Even aside from concerns about whether the final adapted stepsize is a good choice, those extra 18*256 gradient evaluations (compared to staying in the realm of treedepth = 8) are a nontrivial amount of computation, on the order of several core-hours (albeit small compared to the entirety of the job).

  3. It’s pretty clear that the term buffer has not allowed the model to settle down to its stationary distribution of step-size exploration, much as was reported above.

  4. Given the first plot that I showed, would it be super-dangerous for me to run the warm-up, then eyeball this plot to select a step-size of .027, and then just use that for the sampling phase, enjoying 2-fold speedup in samples per second and nearly 2-fold speedup in effective samples per second?

Edited to add:
Upthread, there was some discussion of using step-size multipliers smaller than 10 at the beginning of each window. Perhaps an alternative solution for the term buffer would be to bound the exploration from below. For example, in my case the term buffer spends nearly half of its time exploring step sizes that are smaller than any stepsize that produced an acceptance probability of less than 0.8 anywhere in the previous 1000 iterations! Surely it isn’t useful in general for the term buffer to clean out the cobwebs quite so deep in the basement.

2 Likes

The plot sure makes it seem like this should work. I am curious as well what you get.

I’m surprised by how sharp the transitions are between treedepths here.

It would be nice if this wasn’t the case. I forget why we always do doublings on the tree; is it that the number of comparisons goes through the roof if, instead of doubling the number of steps, you just add them in blocks?

@nhuurre

1 Like