NUTS misses U-turns, runs in circles until max_treedepth

nhuurre · August 16, 2019, 5:53pm

The dot product becomes negative as soon as the angle between the vectors exceeds 90 degrees which is not a u-turn. To detect when the turn is near 180 degrees you need to compare the momenta not to each other but to some kind of a “pivot vector” that lies between them. I don’t think p1.(p1+p2)/2 is too strict. When sampling from a two dimensional IID normal Stan does limit about 15% of the transitions to a single leapfrog step so the current algorithm has no problem stopping at short u-turns. Limiting max_treedepth to 1 or even 2 seriously lowers effective sample size so I think even 15% is quite a lot.

If I restore the step sizes I get

\begin{eqnarray*} q_{n}-q_{1} & = & \frac{1}{2}\epsilon p_{1}+\sum_{k=2}^{n-1}\epsilon p_{k}+\frac{1}{2}\epsilon p_{n}+\epsilon^2\frac{f_{1}-f_{n}}{4}\\ & = & \sum_{k=1}^n\epsilon p_{k}-\epsilon\left(\frac{1}{2}p_{1}+\frac{1}{2}p_{n}\right)+\epsilon^2\frac{f_{1}-f_{n}}{4}\\ & = & \epsilon\sum_{k=1}^n p_{k}+O\left(\epsilon\right) \end{eqnarray*}

So no, the rho used by the u-turn criterion is a first order approximation to q_n-q_1 (or vice versa). It is the symmetrically shortened sum that is a second order approximation.

I can’t help but digress here. The above calculation suggests a nice visualization for the leapfrog trajectories. Positions q_k are represented by black dots in the obvious way. The force at position q_k is drawn as a vector that starts from q_k and has length \frac{1}{2}\epsilon^2 f_k . The momentum is then drawn as a vector \epsilon p_k placed such that its center coincides with the center of the corresponding force vector. It then happens that the momenta line up end-to-end and create a contiguous “spine” for the trajectory.

Circles are boring so I’ll use inverse square potential for the picture. (this potential is of course pathological due to the infinite peak in the middle and the improper flat tail but let’s not worry about those regions.)
Here is a trajectory for a two dimensional \frac{-1}{2\left|q\right|^2} potential starting at q=(1.5, 0) and p=(0, 0.66).

Symplectic geometry is more than just the position space. To see what the trajectory looks like in the momentum space move all the momenta to the origin while keeping the force vectors attached to their respective momenta. Now the forces create a contiguous path!

With all the relevant quantities in the same picture it’s possible to visualize different u-turn conditions. The original u-turn condition stops the sampler when the line connecting the first and last position makes an acute angle with at least one endpoint momentum.

uturn1

The condition in Stan 2.20 uses a line connecting the ends of the momentum spine.

uturn2

@fehiepsi proposed something like

uturn3

Indeed. I’m not even sure anymore if you can say it is more “complicated.” It has couple of variables more but their purpose is absolutely clear. Any proposed alternative needs a solid theoretical justification first.

Topic		Replies	Views
Best Practice: Max Treedepth During Warmup General	3	625	February 21, 2021
HMC (jittered) vs. NUTS on 1000-dimensional standard normal Algorithms mcmc	9	3843	April 29, 2019
Acceptance ratio in NUTS Developers	4	907	June 28, 2021
Using samples from the adaptation / warmup phase in NUTS General	8	1118	December 21, 2022
The role of "max_treedepth" in No-U-Turn? General fitting-issues	9	5712	September 6, 2021

NUTS misses U-turns, runs in circles until max_treedepth

Related topics