Reference for current version of NUTS/HMC in Stan?

defjaf · February 21, 2022, 3:57pm

Is there a complete and self-contained reference for the current implementation of Stan’s Monte Carlo inference engine? It is briefly discussed in the documentation, which implies that it’s just Hoffman & Gelman’s original NUTS with (just one?) modification, taken from Betancourt’s 2016 paper, regarding the selection of the final location of the trajectory. (I admit that I find the latter quite hard to parse, albeit without having given it too much effort!)

But at least in this thread there does seem to be some discussion between @betanalpha, @Bob_Carpenter, and others about whether there are any additional differences from just-plain-NUTS which just cuts off without a definitive conclusion…

yizhang · February 23, 2022, 5:23pm

No. But I think we should.

rok_cesnovar · February 23, 2022, 5:34pm

The last time I asked around on this, I think we settled on the implementation being what is detailed in https://arxiv.org/pdf/1701.02434.pdf (specifically Appendix A.5) plus this PR: Feature/issue 2799 robust no u turn by betanalpha · Pull Request #2800 · stan-dev/stan · GitHub

I don’t think anything substantial has changed since.

maxbiostat · February 23, 2022, 6:29pm

@betanalpha

WardBrian · February 23, 2022, 7:33pm

There is an open docs issue to try to clarify some of this: Clarify that Stan uses non-standard/improved NUTS · Issue #436 · stan-dev/docs · GitHub

betanalpha · February 25, 2022, 8:31pm

This is correct. To summarize, the main changes from the original NUTS algorithm to the current Stan implementation are:

1. The termination criterion has been modified to a more geometrically formal termination criterion based on momenta and not positions.
2. The termination criterion is checked not just at the boundaries of each binary subtree but also between subtrees. These additional checks help to avoid discretization issues that result in numerical trajectories that are too long.
3. States are not sampled from numerical trajectories using slice sampling but rather direct multinomial sampling.
4. The adaptation of the step size still uses dual averaging but the target has been modified based on rigorous theoretical calculations. The adaptation of the inverse metric has also been modified slightly.
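
To make the first and third changes concrete, here is a minimal Python sketch, not Stan's actual C++ code. It contrasts the position-based check from the original NUTS paper with a momentum-based check, and shows plain multinomial selection of a state with weights proportional to e^{-H}. All function names are hypothetical. The sketch assumes `rho` is the sum of the momenta along the trajectory and `p_sharp` is the inverse metric applied to the momentum at each end. It ignores the extra between-subtree checks and any refinements Stan applies when sampling across subtrees.

```python
import math
import random


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def continue_position_based(q_minus, q_plus, p_minus, p_plus):
    """Original NUTS check: keep expanding while both ends of the
    trajectory still move away from each other in position space."""
    dq = [b - a for a, b in zip(q_minus, q_plus)]
    return dot(dq, p_minus) >= 0 and dot(dq, p_plus) >= 0


def continue_momentum_based(p_sharp_minus, p_sharp_plus, rho):
    """Momentum-based check (assumed form): rho is the summed momentum
    along the trajectory, p_sharp is M^{-1} p at each end."""
    return dot(p_sharp_minus, rho) > 0 and dot(p_sharp_plus, rho) > 0


def multinomial_weights(energies):
    """Normalized weights proportional to exp(-H); the minimum energy
    is subtracted first for numerical stability."""
    h_min = min(energies)
    w = [math.exp(-(h - h_min)) for h in energies]
    total = sum(w)
    return [x / total for x in w]


def pick_state(states, energies, rng):
    """Draw one state from the trajectory by multinomial sampling."""
    return rng.choices(states, weights=multinomial_weights(energies), k=1)[0]
```

For example, `pick_state(["a", "b"], [0.0, 0.0], random.Random(1))` returns either state with equal probability, while a state with a much larger energy error is almost never selected.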

I consider the diagnostics specific to Hamiltonian Monte Carlo as orthogonal to the algorithm structure.

There is an open pull request on improving the statistic used for the step size adaptation, but that’s been languishing in pull request hell.

stevebronder · February 25, 2022, 8:42pm

Which PR is that?

betanalpha · February 25, 2022, 8:59pm

github.com/stan-dev/stan

Feature/2789 improved stepsize adapt target

stan-dev:develop ← stan-dev:feature/2789-improved_stepsize_adapt_target

opened 10:39PM - 11 Oct 19 UTC

betanalpha

+99 -81

#### Submission Checklist

- [X] Run unit tests: `./runTests.py src/test/unit`…
- [X] Run cpplint: `make cpplint`
- [X] Declare copyright holder and open-source license: see below

# Summary

Addresses #2789 by updating the acceptance statistic used in the step size adaptation.

# Intended Effect

Yields more effective samples per gradient evaluation for models with dimensions more than O(10). Exact threshold depends on correlations in the target density.

This new branch considers each state as a proxy proposal but weights the corresponding acceptance probabilities by the state weight, e^{-H(q, p)}. In this way those states that are likely to be chosen have a greater influence on the proxy acceptance statistic, and any state with vanishing weight is not considered at all. This will always lead to a higher proxy acceptance stat for a given step size, which will cause the adaptation to push toward higher step sizes and, for a fixed integration time, cheaper numerical trajectories.

# How to Verify

Compare effective sample size per gradient evaluation against develop. See also the suite of test results in the report attached to https://discourse.mc-stan.org/t/request-for-volunteers-to-test-adaptation-tweak/9532/53.

# Side Effects

Pushing the step size to larger values makes the generalized No-U-Turn criterion less accurate. Combined with the multiplicative trajectory expansion this can result in a heavy tail of longer numerical trajectories, especially in lower dimensions.

# Documentation

Inline.

# Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Michael Betancourt

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
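
The weighting described in the pull request can be illustrated with a rough Python sketch. This is one reading of the description, not the code in the PR. It compares an unweighted average of per-state acceptance probabilities (assumed here as the baseline) with an average weighted by e^{-H}. The function names and the use of the initial energy `h0` as the reference point are assumptions for illustration.

```python
import math


def acceptance_probs(h0, energies):
    """Per-state proxy acceptance probability min(1, exp(h0 - h)),
    written to avoid overflow when h is far below h0."""
    return [1.0 if h <= h0 else math.exp(h0 - h) for h in energies]


def uniform_accept_stat(h0, energies):
    """Baseline (assumed): plain average over all trajectory states."""
    a = acceptance_probs(h0, energies)
    return sum(a) / len(a)


def weighted_accept_stat(h0, energies):
    """Proposed variant (as read from the PR text): average weighted
    by exp(-H), so states likely to be chosen dominate the statistic."""
    a = acceptance_probs(h0, energies)
    h_min = min(energies)  # shift for numerical stability
    w = [math.exp(-(h - h_min)) for h in energies]
    return sum(wi * ai for wi, ai in zip(w, a)) / sum(w)
```

Because both the weight and the acceptance probability decrease as the energy grows, the weighted statistic is never smaller than the unweighted one in this sketch, which matches the PR's remark that the adaptation is pushed toward larger step sizes.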

nhuurre · February 25, 2022, 10:16pm

So you’re saying that fourth change isn’t part of the current Stan implementation?

I wasn’t going to advertise it on Discourse because, IIRC, back when that pull request was reverted, someone expressed the opinion that maybe we shouldn’t discuss contentious issues on the same forum new users come to for help. But a couple of days ago I posted a related question on GitHub, so if you want to move on from “pull request hell” you could explain these rigorous theoretical calculations there.

betanalpha · March 15, 2022, 8:19pm

Stan currently uses a more effective adaptation target than the original NUTS paper, and there are still better adaptation targets to consider.

defjaf · March 15, 2022, 9:25pm

I think these responses make it pretty clear that we do need some better documentation on the current algorithms!

(Is the current “more effective adaptation target” actually discussed somewhere?)
