Stepsize randomisation

felixmueller · July 4, 2019, 11:40am

I have a question regarding the stepsize and stepsize_jitter parameters. I am using rstan, but I guess that the other interfaces have similar parameters.

I would like to know how exactly the “jittering” is done, i.e. how the supplied stepsize is randomised. The parameter stepsize_jitter is supposed to be in [0, 1] and I would guess that higher values randomise more. But what kind of randomisation takes place - is it uniform, Gaussian, …?

Moreover: Is it possible to supply my own randomised timesteps, e.g. an array (length iter) of timesteps to be used for the iterations? The specification for the parameter stepsize says it’s double, which makes me guess I cannot simply supply an array.

bbbales2 · July 4, 2019, 1:44pm

The docs for this are in the Stan Reference Manual, section 14.2, “HMC Algorithm Parameters”.

The actual function is here: https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/base_hmc.hpp#L174

I don’t know of a way to do that without modifying Stan itself.
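To make the first part of the answer concrete, here is a minimal Python sketch of what the jitter appears to do, assuming the behaviour the reference manual describes (a uniformly random jitter of the step size, as a fraction of the nominal value). The function name `jittered_stepsize` and its arguments are made up for illustration and are not part of any Stan interface; the linked C++ function is the authoritative version.

```python
import random


def jittered_stepsize(nominal, jitter, rng=random):
    """Illustrative only: draw a step size uniformly from
    nominal * [1 - jitter, 1 + jitter].

    Assumes uniform jitter, as described in the reference manual.
    """
    if not 0.0 <= jitter <= 1.0:
        raise ValueError("jitter must lie in [0, 1]")
    u = rng.random()  # uniform on [0, 1)
    return nominal * (1.0 + jitter * (2.0 * u - 1.0))


rng = random.Random(1)
draws = [jittered_stepsize(0.1, 0.5, rng) for _ in range(10_000)]
# With nominal = 0.1 and jitter = 0.5, every draw falls between 0.05 and 0.15.
print(min(draws), max(draws))
```

Under this reading, stepsize_jitter = 0 leaves the step size untouched and stepsize_jitter = 1 spreads it over (0, 2 * stepsize).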

betanalpha · July 4, 2019, 5:40pm

Stepsize jitter is strongly not recommended and will be deprecated in upcoming versions of Stan.

felixmueller · July 5, 2019, 9:25am

Here again, thanks for the quick answer @bbbales2!

Why is that @betanalpha? Will the stepsize be randomised another way or will it simply be static throughout the MC iterations?

bbbales2 · July 5, 2019, 12:11pm

The stepsize only changes during the adaptation stage (after each MCMC draw). During the sampling stage the stepsize stays the same. You should be able to use get_sampler_params with inc_warmup = TRUE in RStan to see what’s actually being used.

betanalpha · July 5, 2019, 1:30pm

Step size will vary during adaptation, and that history of step sizes can be recovered in RStan and PyStan as @bbbales2 notes (it’s included immediately in CmdStan so there you just need to save the warmup iterations). The step size in that adaptation phase is technically random, but only because the adaptation is conditioning on the random realization of the Markov chain.

Adding jitter to the step size will randomize it during the main sampling phase as well, which you can see in the same way by examining the non-warmup samples. Once this jitter is deprecated the step size will remain constant during the main sampling phase.

Jitter was originally introduced by Neal as a way of occasionally including small step sizes that might allow exploration of regions of high curvature. Unfortunately this requires being in the right place at the right time, and the probability that everything coincidentally aligns is quite small. Indeed it mostly ends up compromising performance away from the regions of high curvature, where the fluctuations to smaller step sizes require more costly transitions.

There is another argument for varying step sizes that comes from a theoretical ergodicity perspective. Here the variation is not so much a question of performance but rather of mathematical convenience for getting the numerical integrator to “smear out” and cover the entire parameter space (and even then it’s hard to decouple the varying step size from the varying integration time it induces). This approach requires an exponential distribution of step sizes, which allows not only for smaller step sizes but also for step sizes much larger than nominal. Empirically this kind of variation does not seem to offer much in terms of performance or robustness, but that might not be unexpected given that it is aimed at static integration time methods and not the dynamic methods that we employ in Stan.
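The contrast between the two kinds of variation can be sketched numerically. The snippet below is only an illustration: it assumes uniform jitter around the nominal step size on one side, and on the other an exponential distribution whose mean equals the nominal step size (the choice of mean is my assumption, not something stated above). The helper names are hypothetical.

```python
import random


def uniform_jitter_draw(nominal, jitter, rng):
    # Assumed uniform jitter: nominal * [1 - jitter, 1 + jitter].
    return nominal * (1.0 + jitter * (2.0 * rng.random() - 1.0))


def exponential_draw(nominal, rng):
    # Exponential with mean `nominal` (an assumption made for illustration);
    # expovariate takes the rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / nominal)


rng = random.Random(2019)
nominal = 0.1
n = 10_000

uniform_draws = [uniform_jitter_draw(nominal, 1.0, rng) for _ in range(n)]
exp_draws = [exponential_draw(nominal, rng) for _ in range(n)]

# Even with maximal jitter the uniform scheme never exceeds 2 * nominal,
# whereas the exponential scheme puts real mass well beyond it.
frac_uniform_large = sum(d > 2 * nominal for d in uniform_draws) / n
frac_exp_large = sum(d > 2 * nominal for d in exp_draws) / n
print(frac_uniform_large, frac_exp_large)
```

Both schemes produce small step sizes, but only the exponential one also produces step sizes far above nominal, which is the property the ergodicity argument relies on.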

bbbales2 · July 5, 2019, 3:55pm

Whoops, my bad. I didn’t realize this was the case.
