I’m trying to follow base_nuts.hpp, but it’s challenging
to try to see the forest for the trees.
From what I can tell, it’s not doing what the original
NUTS algorithm did and just selecting from the newly
built subtree (C^{new} in the NUTS paper), but rather
selecting the result from the new subtree if the new subtree
has a higher sum of densities. Is that right? Does it
work out the same way because Matt was trying to get a uniform
distribution and now we’re trying to get a discrete (multinomial)?
I just don’t see how it preserves balance the way the original
approach does.
I also can’t quite get the build left and build right
to line up with what I see in the original paper. The
original seems to have built the first tree, then based on the the
direction randomization (s’), extended it either left or
right (wish this had been called forward and backward for
the time analogy).
I’d suggest calculating log_accept_prob and applying log() to
the uniform(0,1) rather than exponentiating the log accept prob.
Less likely to be numerical issues. But that’s not the cause
of this problem.
The log_sum_exp(a, log_sum_exp(b, c)) would
benefit from a ternary log_sum_exp calc—not sure it’d be worth
allocating a vector here to cut down on function calls, including
some expensive ones to exp(). Probably.
- Bob