HMC (jittered) vs. NUTS on 1000-dimensional standard normal

This is great. I just have a few comments.

I think it will be helpful to separate static HMC methods (which use a fixed number of leapfrog steps) from the “vintage” HMC method, which uses a fixed number of leapfrog steps but considers only the end point of the trajectory in a Metropolis proposal. Even with a fixed number of leapfrog steps one can improve considerably on vintage HMC by sampling from the entire trajectory instead of just using the end point.
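
To make the distinction concrete, here is a minimal sketch of both updates for a standard normal target. The names (`leapfrog_step`, `vintage_hmc_update`, `static_multinomial_update`) and the bookkeeping are my own illustration, not Stan's implementation; the static scheme places the current state uniformly within the fixed-length trajectory so that the update stays reversible.

```python
import numpy as np

def grad_U(q):
    # Standard normal target: U(q) = q.q / 2, so grad U(q) = q.
    return q

def hamiltonian(q, p):
    return 0.5 * q @ q + 0.5 * p @ p

def leapfrog_step(q, p, eps):
    p = p - 0.5 * eps * grad_U(q)
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)
    return q, p

def vintage_hmc_update(q, eps, n_steps, rng):
    # Vintage HMC: Metropolis accept/reject on the end point only.
    p = rng.standard_normal(q.shape)
    h0 = hamiltonian(q, p)
    q_new, p_new = q, p
    for _ in range(n_steps):
        q_new, p_new = leapfrog_step(q_new, p_new, eps)
    if rng.uniform() < np.exp(h0 - hamiltonian(q_new, p_new)):
        return q_new
    return q

def static_multinomial_update(q, eps, n_steps, rng):
    # Same fixed trajectory length, but sample a point from the whole
    # trajectory with weights proportional to exp(-H).
    p = rng.standard_normal(q.shape)
    n_back = rng.integers(0, n_steps + 1)   # uniform placement of the start
    states = [(q, p)]
    qb, pb = q, p
    for _ in range(n_back):                 # integrate backwards in time
        qb, pb = leapfrog_step(qb, pb, -eps)
        states.insert(0, (qb, pb))
    qf, pf = q, p
    for _ in range(n_steps - n_back):       # integrate forwards in time
        qf, pf = leapfrog_step(qf, pf, eps)
        states.append((qf, pf))
    h = np.array([hamiltonian(qs, ps) for qs, ps in states])
    w = np.exp(h.min() - h)                 # numerically stable exp(-H) weights
    return states[rng.choice(len(states), p=w / w.sum())][0]
```

Iterating either update, e.g. `q = static_multinomial_update(q, 0.2, 32, rng)` with `rng = np.random.default_rng(0)`, gives the two chains to compare.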

Speaking of sampling points from a trajectory, the current implementation in Stan isn’t exactly a multinomial sample from the entire trajectory. Following the original NUTS implementation, it instead first samples a sub-trajectory (i.e. one of the trajectory expansions) with a bias towards later sub-trajectories, and then samples a point from that sub-trajectory. See Section A.3 of https://arxiv.org/abs/1701.02434. This is largely what causes the antithetic behavior for the means.
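
A rough sketch of just that selection rule (the names are hypothetical; in the real sampler each weight would be the sum of exp(-H) over the states of one trajectory expansion, and each expansion's own sample would itself be drawn multinomially within that sub-trajectory):

```python
import numpy as np

def biased_progressive_select(expansion_weights, expansion_samples, rng):
    # Keep a running sample across trajectory doublings; each new
    # sub-trajectory's sample replaces it with probability
    # min(1, W_new / W_old), biasing selection toward later expansions.
    sample = expansion_samples[0]
    w_total = expansion_weights[0]
    for w_new, s_new in zip(expansion_weights[1:], expansion_samples[1:]):
        if rng.uniform() < min(1.0, w_new / w_total):
            sample = s_new
        w_total += w_new
    return sample
```

This isolates only the selection rule; in the full algorithm the loop runs as the trajectory is expanded, and expansion can terminate early on a U-turn or a divergence.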

When considering the optimal number of leapfrog steps for vintage HMC, one has to account for the error in the simulated trajectory. The expected jump distance isn’t quite uniform because the error has a fairly systematic structure – it starts off small, grows, then shrinks again as you move towards the other side of the energy level set. See Figure 5 of https://arxiv.org/abs/1411.6669 for an illustration of the behavior (which is actually exact for a 1D Gaussian target). In other words, once the Metropolis acceptance step is taken into account, the expected jump distance will stay close to the initial point for a while before jumping further away once the number of leapfrog steps gets large enough to probe the other side of the level set.
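
This is easy to check numerically in the 1D Gaussian case; a small sketch of my own (the step size and trajectory length are arbitrary illustrative values) that prints the energy error after each leapfrog step of a single trajectory:

```python
import numpy as np

# Energy error along one leapfrog trajectory for a 1D standard normal
# target. The error's magnitude starts small, grows, then shrinks back
# toward zero as the trajectory sweeps around to the other side of the
# energy level set, and the pattern repeats periodically.
eps, n_steps = 0.5, 40
q, p = 1.0, 0.0
h0 = 0.5 * (q * q + p * p)
errors = []
for _ in range(n_steps):
    p -= 0.5 * eps * q          # half momentum step (dU/dq = q)
    q += eps * p                # full position step
    p -= 0.5 * eps * q          # half momentum step
    errors.append(0.5 * (q * q + p * p) - h0)
print(np.round(errors, 4))
```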

Also, if you extend the step sizes under consideration past 2, you’ll see the instability of the symplectic integrator set in and the effective sample size plummet (although that won’t mean much anymore, as you’ll no longer have unbiased estimators).
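
For a unit-scale Gaussian the leapfrog stability threshold is exactly 2, so the blow-up is easy to see directly; a quick sketch with step sizes chosen to bracket the threshold:

```python
# Leapfrog on a 1D standard normal target is stable only for eps < 2;
# just past that the iterates grow exponentially.
for eps in (1.9, 2.1):
    q, p = 1.0, 0.0
    for _ in range(50):
        p -= 0.5 * eps * q
        q += eps * p
        p -= 0.5 * eps * q
    print(f"eps = {eps}: |q| after 50 steps = {abs(q):.3e}")
# eps = 1.9 stays bounded; eps = 2.1 blows up.
```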
