Quick question about HMC

Hi all,

I came across this nice interactive gallery for demonstrating MCMC by Chi Feng. The Markov-chain Monte Carlo Interactive Gallery. With this gallery and Michael Betancourt’s wonderful papers on HMC, I do have one question. I mostly understand the conceptual idea of keeping the algorithm from drifting outside the typical set, and also how it keeps from crashing into areas of very high density. What I don’t quite understand is how it stops and yields a value. That is, if you look at the gallery for HMC under the “standard” distribution, it starts the sojourn and then eventually stops, with some green line connecting where it started and where it ended. What aspect of HMC decides where it ends and yields the estimate? I hope that was clear.

Thank you,

David

1 Like

See here.

There are two questions here, and multiple schemes for answering each. One is how you figure out when to stop integrating. The other is, given a trajectory that has been stopped, how do you pick a point along it to treat as the next estimate.

For example, in one version of static HMC as described here 15.1 Hamiltonian Monte Carlo | Stan Reference Manual, the “length” of the trajectory is fixed a priori, and the point reached at the end of the trajectory is subjected to a Metropolis accept/reject step.

In the version of dynamic HMC described here 15.2 HMC algorithm parameters | Stan Reference Manual (and used by Stan by default), the “length” of the trajectory is determined on the fly based on when it reaches a stopping criterion, and then the next MCMC draw is determined by weighted multinomial sampling along the entire trajectory. Note that the manual is being imprecise when it refers to this version of dynamic HMC as “NUTS”. Everyone agrees that NUTS uses the stopping criterion described in this section, but some people (notably @betanalpha and other true experts in this arena) restrict “NUTS” to a specific algorithm that pairs this stopping criterion with a method for picking a point along the trajectory that is no longer used by Stan’s dynamic HMC implementation (and also is not the Metropolis accept/reject at the trajectory’s terminus described above).

I put “length” in quotes above because I’m using it to mean the number of steps taken by the numerical integrator. If you like to think about the Hamiltonian dynamics via analogy to a physical system, this notion of “length” could lead you astray, since the integrator can travel different physical distances in different steps (depending on how fast the particle is moving–i.e. the kinetic energy). The thing that is fixed in the hypothetical physical system is the integration time with each integration step corresponding to a fixed small timestep. But the word time isn’t unambiguous either, because in the actual world that we and your computer inhabit (rather than the hypothetical physical system being simulated), it is not true that static HMC integrates for a fixed amount of time!

3 Likes

This is a great answer. Thank you.

1 Like