Because it doesn't compare to well-implemented Hamiltonian Monte Carlo…

There have been a variety of gradient-based methods introduced since the advent of Hamiltonian Monte Carlo, but because none of them capture the right structure of the target typical set, none have been able to outperform Hamiltonian Monte Carlo.

In one dimension Zig-Zag works by making translations that exactly preserve the target distribution. In order to generalize to higher-dimensional spaces one has to choose a single direction in which to make each translation. The baseline implementation chooses one of the diagonals, so the resulting Markov chain looks like a "zig zag" through the parameter space.
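To make the mechanics concrete, here is a minimal sketch (not from the original discussion) of a Zig-Zag sampler for a standard Gaussian target. Each coordinate carries a velocity of +1 or -1 and the state translates along the resulting diagonal until one coordinate's velocity flips; for a Gaussian the switching rates are linear in time and can be inverted exactly, so no thinning is needed. The function name and signature are illustrative.

```python
import numpy as np

def zigzag_gaussian(d, T, dt, seed=0):
    """Sketch of a Zig-Zag sampler targeting a standard Gaussian in d dimensions.

    Coordinate i of the velocity flips at rate max(0, v_i * x_i(t)); along a
    straight segment this rate equals max(0, a_i + t) with a_i = v_i * x_i,
    which can be inverted in closed form to draw exact event times.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    v = rng.choice([-1.0, 1.0], size=d)
    t, next_record = 0.0, 0.0
    samples = []
    while t < T:
        a = v * x
        e = rng.exponential(size=d)
        # Exact first-event time for each coordinate's linear-in-time rate:
        # solve the integrated rate equal to an Exp(1) draw.
        tau = -a + np.sqrt(np.maximum(a, 0.0) ** 2 + 2.0 * e)
        i = np.argmin(tau)          # the coordinate that flips first
        t_next = t + tau[i]
        # The *continuous* trajectory, not the event skeleton, is distributed
        # according to the target, so record it on a regular time grid.
        while next_record < min(t_next, T):
            samples.append(x + v * (next_record - t))
            next_record += dt
        x = x + v * tau[i]
        v[i] = -v[i]                # flip only the triggering coordinate
        t = t_next
    return np.array(samples)
```

Running this in two dimensions and plotting the recorded positions traces out exactly the diagonal "zig zag" path described above.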

The problem is that the typical set that the Markov chain needs to explore isn't straight; in simple cases it's more spherical. Consequently these translations can only go so far before they would start to leave the concentration of high probability mass. As the dimension increases the translations get shorter and shorter, the direction sampling gets more chances to point back in the direction of the previous point, and the algorithm starts to look more like a diffusion.
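The shrinking translations are easy to see numerically. The following standalone sketch (my own illustration, with a hypothetical helper name) simulates Zig-Zag event times for a standard Gaussian target and estimates the mean segment length between velocity flips; with roughly d independent flip clocks running, the first event arrives on a timescale of order 1/d.

```python
import numpy as np

def mean_segment_length(d, n_events=5000, seed=0):
    """Crude estimate of the average translation length between velocity
    flips for a Zig-Zag process targeting a standard Gaussian in d dims."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    v = rng.choice([-1.0, 1.0], size=d)
    total = 0.0
    for _ in range(n_events):
        a = v * x
        # Exact event times for the Gaussian target's linear-in-time rates.
        tau = -a + np.sqrt(np.maximum(a, 0.0) ** 2
                           + 2.0 * rng.exponential(size=d))
        i = np.argmin(tau)
        total += tau[i]             # segment ends at the first flip
        x += v * tau[i]
        v[i] = -v[i]
    return total / n_events
```

Comparing `mean_segment_length(1)` against `mean_segment_length(100)` shows the segments collapsing as the dimension grows, which is exactly the diffusive degradation described above.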

The Bouncy Particle Sampler is another algorithm that turns out to be essentially equivalent to the Zig-Zag sampler, and it runs into a similar issue: as the dimension increases the bounces start to dominate over the translations in between them.

When experimenting with a new algorithm I recommend trying to find some motivation for *why* it should work better and then validate that motivation with experiments. The reality is that most algorithms will not scale and so a motivation might not exist, but the experiments will help build your understanding as to why.