Details on how Stan adaptively tunes the HMC parameters? (i.e. mass matrix, step size and leapfrog steps)

I was wondering if anybody knew where I could find out details of how Stan adaptively tunes the three HMC parameters:

  • Step size
  • Number of leapfrog steps
  • Mass matrix (aka the ‘metric’)

I asked Michael Betancourt (@betanalpha) about this and he pointed me to some useful papers describing the first two: he told me that Stan uses (a variation of) dual averaging for the step size, and the method described in the appendix of this paper for the number of leapfrog steps.

However, I can’t find much detailed information about the mass matrix adaptation. There is some information in the CmdStan documentation, but it isn’t particularly detailed.

During the windowed phase, at the end of each window Stan computes a regularized estimate of the marginal variances of the unconstrained parameters. See https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/var_adaptation.hpp
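If it helps, my paraphrase of that file is that the end-of-window update shrinks the per-parameter sample variances towards a small constant, with a weight that grows with the number of draws in the window. Something like this (a sketch, not the actual Stan source; the constants are my reading of it):

```cpp
#include <Eigen/Dense>

// Sketch of the regularized variance estimate computed at the end of each
// adaptation window (my paraphrase of var_adaptation.hpp; the constants
// 5.0 and 1e-3 are my reading of the source, so treat them as an assumption).
Eigen::VectorXd regularized_variance(const Eigen::VectorXd& sample_var,
                                     double n_draws) {
  double w = n_draws / (n_draws + 5.0);  // weight on the data grows with n
  // Shrink each marginal variance towards a small constant so that short,
  // noisy windows don't produce a wild metric.
  return w * sample_var
         + 1e-3 * (1.0 - w) * Eigen::VectorXd::Ones(sample_var.size());
}
```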


Thanks!

I was wondering if you also knew how Stan adapts L and epsilon differently from the original NUTS paper. I understand that it adapts the log step size rather than the step size itself, and that it doesn’t use slice sampling but instead samples directly from a multinomial distribution over the states in the trajectory, but I was wondering where I could find the C++ code that implements this.
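To check my understanding of the multinomial part: I believe each state along the simulated trajectory is selected with probability proportional to exp(-H(q, p)), something like this toy sketch (my own illustration, not Stan’s code, which I gather builds the trajectory progressively):

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Toy illustration only: choose one state from a simulated trajectory with
// probability proportional to exp(-H), instead of slice sampling as in the
// original NUTS paper. Returns the index of the chosen state.
int multinomial_sample(const std::vector<double>& energies, std::mt19937& rng) {
  // Subtract the minimum energy before exponentiating, for numerical stability.
  double e_min = *std::min_element(energies.begin(), energies.end());
  std::vector<double> weights;
  for (double e : energies)
    weights.push_back(std::exp(-(e - e_min)));
  std::discrete_distribution<int> pick(weights.begin(), weights.end());
  return pick(rng);
}
```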

In particular, on p. 49 of “A Conceptual Introduction to HMC” it says:

However, I’m not sure exactly how to modify the original NUTS algorithm (Algorithm 6 in the Hoffman & Gelman paper) to incorporate this, or what changes to make when tuning the log step size rather than the step size itself.
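For what it’s worth, my current guess at the dual-averaging update on x = log(epsilon) is along these lines (just my sketch; the names and default values are assumptions, not Stan’s actual code):

```cpp
#include <cmath>

// Sketch of a dual-averaging update on x = log(epsilon), in the spirit of
// Algorithm 6 of Hoffman & Gelman. All names and defaults are my assumptions.
struct StepsizeAdapter {
  double mu = std::log(10.0);  // Stan sets mu = log(10 * initial stepsize), I believe
  double delta = 0.8;          // target acceptance statistic (adapt_delta)
  double gamma = 0.05, t0 = 10.0, kappa = 0.75;  // dual-averaging constants
  double s_bar = 0.0, x_bar = 0.0;
  int counter = 0;

  // accept_stat: average acceptance probability over the last trajectory.
  // Returns the step size to use for the next iteration.
  double update(double accept_stat) {
    ++counter;
    double eta = 1.0 / (counter + t0);
    s_bar = (1 - eta) * s_bar + eta * (delta - accept_stat);  // running error
    double x = mu - s_bar * std::sqrt(counter) / gamma;       // new log stepsize
    double x_eta = std::pow(counter, -kappa);
    x_bar = (1 - x_eta) * x_bar + x_eta * x;                  // averaged iterate
    return std::exp(x);
  }

  // After warmup, the averaged value exp(x_bar) is what gets used.
  double final_stepsize() const { return std::exp(x_bar); }
};
```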

I’m not the right person to serve as a guide to this code (I wish I were!), but in case it helps, I’m pretty sure all the relevant code is here (including the subdirectories): stan/src/stan/mcmc at develop · stan-dev/stan · GitHub


Thanks, I’ll have a look!