NUTS differences in Stan vs paper


#1

I don’t seem to be able to post in the developer section :( so forgive me for posting here.

I couldn’t find this in the new (discourse) discussion forum https://groups.google.com/forum/#!topic/stan-dev/gPHJdAns6rc from which I quote:

I’m wondering if there have been any changes to the NUTS algorithm since the paper was published.

Yes. See https://arxiv.org/abs/1601.00225, Section 2.

I have read through Section 2 and two termination criteria are given: exhaustive and no u-turn. As far as I can see the latter is the same as in the original paper (assuming the metric is Euclidean). Can someone spell out the difference explicitly?

BTW my intention is to reproduce the results in Section 4 of the referenced paper.


#2

Section 2 covered both the termination criterion and how to sample from a trajectory. We no longer use a slice sampler to sample a state from a trajectory, but rather sample directly from the
marginal probabilities as discussed in the paper.

If you want to know the exact implementation then please read the code.


#3

I have never read C++ before so I hope you forgive anything that would be obvious to someone who knows the language.

I can see you changed from slice sampling to multinomial sampling in 05dcaa7. And (some of? all of?) the variable names match the paper e.g. z_minus I assume matches $z_-$. That is very helpful :)

And I have found the check for the u-turn in base_nuts.hpp:

      // Break when NUTS criterion is no longer satisfied
      rho += rho_subtree;
      if (!compute_criterion(p_sharp_minus, p_sharp_plus, rho))
        break;

So I think I can see what is going on. Thanks very much :)

Anything else I should look at or be aware of?


#4

I do think you found the essentials.