Riemann manifold HMC in Stan

2014 was…a more optimistic time.

Back in the early days of Stan our understanding of Hamiltonian Monte Carlo was relatively immature. Riemannian Hamiltonian Monte Carlo was a promising generalizing of existing implementations of Hamiltonian Monte Carlo but there were plenty of implementation challenges remaining including a useful metric (https://arxiv.org/abs/1212.4693) and generalizing the original No-U-Turn algorithm to work with any variant of Hamiltonian Monte Carlo (https://arxiv.org/abs/1304.1920). While those more theoretical results were being worked out we were also redesigning the Stan back end and I took advantage of refactoring the Hamiltonian Monte Carlo code in Stan to be extremely modular so that it could support different kinetic energies (constant/varying metric, metrics of different structure), different numerical integrators (explicit, implicit, varying orders), and the like. To do this day there is a dynamic Riemannian Hamiltonian Monte Carlo implementation in Stan – it just isn’t exposed through most of the interfaces!

Why isn’t that Riemannian Hamiltonian Monte Carlo implementation exposed? Well at the same time we were developing the theory and implementation for Riemannian Hamiltonian Monte Carlo we were also learning more about the tuning of the numerical integrator (https://arxiv.org/abs/1411.6669) convergence properties of the method (https://arxiv.org/abs/1411.6669). In particular based on the stability of the numerical integrator I introduced the concept of a divergent transition as a diagnostic of incomplete exploration. The problem with Riemannian Hamiltonian Monte Carlo is that the numerical integration can actually be unstable in different ways, some related to the geometry (and divergences) and some related to the implementation (in particular the fixed point iterations needed to evaluate the implicit integrator). In practice it wasn’t clear how to differentiate these two issues and hence how practitioners could respond in principled ways.

Ultimately the numerical integration of Riemannian Hamiltonian Monte Carlo was too fragile and required too much hand-tuning from the user to work reasonably well. As the goal of Stan was to hide those algorithmic details from the user as much as possible this proved to be a very unwelcome user experience and so enthusiasm for these methods waned.

At the same time we were learning more about the kinds of model degeneracies that tended to break the numerical integrator in Hamiltonian Monte Carlo (https://betanalpha.github.io/assets/case_studies/identifiability.html) and various ways to resolve those degeneracies by modifying the model. Because Stan users specify their own model through a Stan program already, which was a much more natural extension to the Stan user experience than forcing users to learn more about the underlying (and complex) algorithms.

In fact I recently showed that the reparameterizations that can improve the performance of Euclidean Hamiltonian Monte Carlo (constant metric) are actually equivalent to running Riemannian Hamiltonian Monte Carlo (https://arxiv.org/abs/1910.09407). Consequently playing around with the model implementation can achieve the same performance as playing around with the algorithm!

21 Likes