Accessible explanation to the No-U-Turn Sampler

markhw · December 5, 2017, 10:00pm

In my Bayesian seminar today, we discussed at length how step-size and adapt-delta change the way we explore and sample from the posterior distribution. We were looking at the Hoffman & Gelman (2014) paper, but I’m wondering if there is a more intuitive or accessible explanation of what these hyperparameters do, how they affect the way we explore the posterior, what the consequences of changing them are, and what the thinking behind them was.

Does anyone know of a blog post, journal article, or other explanation that covers NUTS in broader, more conceptual terms?

bgoodri · December 6, 2017, 1:34am

https://arxiv.org/abs/1701.02434

monnahc · December 12, 2017, 6:20pm

Michael’s conceptual intro is great. I also tried to explain it to non-mathy types in this paper:

http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12681/abstract

I think it’s a nice complement to the other literature, but naturally that’s a biased opinion.

markhw · December 13, 2017, 12:51am

This is perfect, thanks!

tiagocc · December 13, 2017, 8:47am

This blog post by Richard McElreath is also a very good way to start imo.

markhw · December 13, 2017, 4:03pm

Ah, this is great! This is a perfect first introduction to the sampler, with nice interactives that one can use in a seminar.

Bob_Carpenter · January 9, 2018, 5:25am

adapt_delta just sets the target “acceptance rate” for the sampler. A higher target acceptance rate means adaptation will find smaller step sizes. Once warmup’s done, the adapted step sizes are locked in.
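As a rough illustration of the link between target acceptance and step size, here is a minimal sketch of plain HMC on a one-dimensional standard normal. It is not Stan's adaptation (Stan tunes the step size by dual averaging during warmup, and uses NUTS rather than fixed-length trajectories). The names `mean_accept_prob` and `pick_step_size` and the grid of candidate step sizes are made up for this example.

```python
import math
import random


def leapfrog(q, p, step_size, n_steps):
    """Leapfrog integrator for a standard normal target, where grad U(q) = q."""
    p -= 0.5 * step_size * q
    for i in range(n_steps):
        q += step_size * p
        if i < n_steps - 1:
            p -= step_size * q
    p -= 0.5 * step_size * q
    return q, p


def mean_accept_prob(step_size, n_iter=2000, n_steps=10, seed=1):
    """Average Metropolis acceptance probability of plain HMC at a fixed step size."""
    rng = random.Random(seed)
    q = 0.0
    total = 0.0
    for _ in range(n_iter):
        p = rng.gauss(0.0, 1.0)  # resample the momentum
        h0 = 0.5 * q * q + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, step_size, n_steps)
        h1 = 0.5 * q_new * q_new + 0.5 * p_new * p_new
        accept = math.exp(min(0.0, h0 - h1))  # equals min(1, exp(h0 - h1))
        total += accept
        if rng.random() < accept:
            q = q_new
    return total / n_iter


def pick_step_size(target, candidates):
    """Crude stand-in for adaptation (hypothetical helper): the largest
    candidate step size whose mean acceptance meets the target."""
    ok = [eps for eps in candidates if mean_accept_prob(eps) >= target]
    return max(ok) if ok else min(candidates)


if __name__ == "__main__":
    candidates = [0.1, 0.5, 1.0, 1.5]
    for eps in candidates:
        print(eps, round(mean_accept_prob(eps), 3))
    # A stricter target can only select the same or a smaller step size.
    print(pick_step_size(0.95, candidates), pick_step_size(0.60, candidates))
```

Raising `target` in `pick_step_size` plays the role of raising adapt_delta: only step sizes with small integration error qualify, so the chosen step size can only stay the same or shrink.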

How adaptation works has changed over versions, and what that target acceptance means is now more complicated, because we’re not using the basic NUTS algorithm.

The main issue you run into is conditioning—the usual bugbear of any kind of gradient-based algorithm. If you get into a location in the posterior where the step size is too large, you get divergences. We only use gradient-based approximations (i.e., first order) of the real posterior curvature, so sometimes we need small step sizes to do that accurately.
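To make the conditioning issue concrete, the sketch below runs a leapfrog trajectory on a two-dimensional independent Gaussian with very different scales (1 and 0.01) and tracks how far the Hamiltonian drifts from its starting value. The target, the starting point, and the function name `max_energy_error` are assumptions for illustration only. A very large energy error is the kind of event a sampler reports as a divergence.

```python
def max_energy_error(step_size, scales, n_steps=50):
    """Run one leapfrog trajectory on an independent Gaussian with the given
    standard deviations and return the largest |H - H0| seen along the way."""
    q = list(scales)          # start one standard deviation out in each dimension
    p = [1.0] * len(scales)   # fixed momentum so the example is deterministic

    def grad(q):
        # Gradient of the negative log density: q_i / sigma_i^2
        return [qi / (s * s) for qi, s in zip(q, scales)]

    def hamiltonian(q, p):
        potential = sum(0.5 * (qi / s) * (qi / s) for qi, s in zip(q, scales))
        kinetic = sum(0.5 * pi * pi for pi in p)
        return potential + kinetic

    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(n_steps):
        p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, grad(q))]
        q = [qi + step_size * pi for qi, pi in zip(q, p)]
        p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, grad(q))]
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return worst


if __name__ == "__main__":
    scales = (1.0, 0.01)  # badly conditioned: scales differ by a factor of 100
    for eps in (0.005, 0.05):
        print(eps, max_energy_error(eps, scales))
```

For a Gaussian, leapfrog is only stable when the step size is below twice the smallest scale (0.02 here). A step size of 0.05 would be fine for the wide dimension but makes the trajectory blow up in the narrow one, while 0.005 keeps the energy error small in both. That is why the narrowest direction ends up dictating the step size for the whole model.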
