In my Bayesian seminar today, we discussed at length how step-size and adapt-delta change the way we explore and sample from the posterior distribution. We were looking at the Hoffman & Gelman (2014) paper, but I’m wondering if there is a more intuitive or accessible explanation of what these hyperparameters do, how they affect the way we explore the posterior, what the consequences of changing them are, and what the thinking behind them is.

Does anyone know of a blog post, journal article, or other resource that explains NUTS in broader, more conceptual terms?