NUTS misses U-turns, runs in circles until max_treedepth

The dot product becomes negative as soon as the angle between the vectors exceeds 90 degrees which is not a u-turn. To detect when the turn is near 180 degrees you need to compare the momenta not to each other but to some kind of a “pivot vector” that lies between them. I don’t think p1.(p1+p2)/2 is too strict. When sampling from a two dimensional IID normal Stan does limit about 15% of the transitions to a single leapfrog step so the current algorithm has no problem stopping at short u-turns. Limiting max_treedepth to 1 or even 2 seriously lowers effective sample size so I think even 15% is quite a lot.

If I restore the step sizes I get

\begin{eqnarray*} q_{n}-q_{1} & = & \frac{1}{2}\epsilon p_{1}+\sum_{k=2}^{n-1}\epsilon p_{k}+\frac{1}{2}\epsilon p_{n}+\epsilon^2\frac{f_{1}-f_{n}}{4}\\ & = & \sum_{k=1}^n\epsilon p_{k}-\epsilon\left(\frac{1}{2}p_{1}+\frac{1}{2}p_{n}\right)+\epsilon^2\frac{f_{1}-f_{n}}{4}\\ & = & \epsilon\sum_{k=1}^n p_{k}+O\left(\epsilon\right) \end{eqnarray*}

So no, the rho used by the u-turn criterion is a first order approximation to q_n-q_1 (or vice versa). It is the symmetrically shortened sum that is a second order approximation.

I can’t help but digress here. The above calculation suggests a nice visualization for the leapfrog trajectories. Positions q_k are represented by black dots in the obvious way. The force at position q_k is drawn as a vector that starts from q_k and has length \frac{1}{2}\epsilon^2 f_k . The momentum is then drawn as a vector \epsilon p_k placed such that its center coincides with the center of the corresponding force vector. It then happens that the momenta line up end-to-end and create a contiguous “spine” for the trajectory.

Circles are boring so I’ll use inverse square potential for the picture. (this potential is of course pathological due to the infinite peak in the middle and the improper flat tail but let’s not worry about those regions.)
Here is a trajectory for a two dimensional \frac{-1}{2\left|q\right|^2} potential starting at q=(1.5, 0) and p=(0, 0.66).

Symplectic geometry is more than just the position space. To see what the trajectory looks like in the momentum space move all the momenta to the origin while keeping the force vectors attached to their respective momenta. Now the forces create a contiguous path!

With all the relevant quantities in the same picture it’s possible to visualize different u-turn conditions. The original u-turn condition stops the sampler when the line connecting the first and last position makes an acute angle with at least one endpoint momentum.

uturn1

The condition in Stan 2.20 uses a line connecting the ends of the momentum spine.

uturn2

@fehiepsi proposed something like

uturn3

Indeed. I’m not even sure anymore if you can say it is more “complicated.” It has couple of variables more but their purpose is absolutely clear. Any proposed alternative needs a solid theoretical justification first.

7 Likes