Stan 2.10 through 2.13 have broken samplers

Bob_Carpenter · December 20, 2016, 4:05pm

As far as we can tell, Stan 2.09 is the latest version of Stan
with a properly functioning sampler.

Versions from 2.10 on are producing biased samples
that slightly underestimate posterior variance. Thanks to
Matthew R. Becker for filing the issue:

https://github.com/stan-dev/stan/issues/2178

Stan 2.10 changed the NUTS algorithm from using slice sampling
along a Hamiltonian trajectory to a new algorithm that uses
multinomial sampling:

https://arxiv.org/abs/1601.00225

We are mortified that after all of our nagging to get people
to use samplers that worked and weren’t biased, we released
a biased sampler. The 2.10 version had a major bug which was
easy to see and fix, but that apparently didn’t solve the
bigger problem.

Michael and I are poring over the proofs and the code, but
it’s unfortunate timing with the holidays here as everyone’s
traveling. We’ll announce a fix and make a new release as soon
as we can. Let’s just say this is our only priority at the moment.

Until then, the only thing I can recommend is using straight
up static HMC (which is not broken in the Stan releases)
or using something other than Stan or rolling back to Stan 2.09.

I’m not even sure how to do the latter for versions other than CmdStan,
which is just a source download and doesn’t require any
installation.

If all else fails, we’ll roll back the sampler to the 2.09 version
in a couple days.

Bob

betanalpha · December 20, 2016, 4:22pm

Let me temper the panic by saying that the bias is relatively small and affects only variances but not means, which is why it snuck through all our testing and application analyses. Ultimately posterior intervals are smaller than they should be, but not so much that the inferences are misleading, and the shrinkage will be noticeable only if you have more than thousands of effective samples, which is much more than we typically recommend.
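
A rough back-of-the-envelope sketch of why thousands of effective samples are needed to see a small variance bias. This assumes an approximately Gaussian posterior, where the relative Monte Carlo standard error of a variance estimate is about sqrt(2 / n_eff). The 5% bias used below is a hypothetical figure for illustration only, not a measured value from the issue, and `min_ess_to_detect` is a made-up helper name.

```python
def min_ess_to_detect(rel_bias, z=2.0):
    """Effective sample size at which a relative variance bias `rel_bias`
    exceeds z Monte Carlo standard errors.

    Assumes a roughly Gaussian target, where the relative standard error
    of the variance estimate is about sqrt(2 / n_eff). Solving
    rel_bias = z * sqrt(2 / n_eff) for n_eff gives 2 * (z / rel_bias)**2.
    """
    return 2.0 * (z / rel_bias) ** 2


# Hypothetical 5% underestimate of the variance, two-standard-error threshold.
print(min_ess_to_detect(0.05, z=2.0))
```

Under these assumptions, halving the bias quadruples the effective sample size needed to notice it.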

Static HMC seems to be giving valid results on the simple test problems that we are considering, but it still performs horribly on hard problems and so I would advise against using it seriously.

Bob_Carpenter · December 20, 2016, 5:14pm

I updated the blog post with Michael’s comment, which pretty much
matches what Andrew said.

I’m still mortified, not because bugs get through, but because this
is one we should be able to catch. On the plus side, we now have the
model to catch this in future regression tests (what computer scientists
call tests that make sure working behavior doesn’t “regress” to
a previous buggy behavior).

Bob

jonah · December 22, 2016, 5:45pm

From rstan you can set algorithm="HMC" when calling stan(), but I would
trust Michael on this. That is, even with the bug NUTS should be better
than static HMC (except for some trivial cases).

betanalpha · December 22, 2016, 5:50pm

Correct.

syclik · December 27, 2016, 2:28am

Stan 2.14 is out now.

syclik · December 27, 2016, 2:30am

@betanalpha, I saw the unit tests. I don’t remember if there was a test added to help us catch this sort of bug in the future. If we didn’t add one with the pull request, could we add one now?

betanalpha · December 27, 2016, 4:33am

Look at the diff – two tests were added to catch the particular bugs addressed in the PR.

syclik · December 27, 2016, 4:51am

I remember those tests. I was thinking something end-to-end. We know what’s
correct in analytic models. We should know if we introduce something that’s
not correct.
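
One possible shape for such an end-to-end check, as a minimal sketch rather than the test that actually went into Stan: draw from a target with a known variance, then require the estimated variance to fall within a few Monte Carlo standard errors of the truth. Everything here is an assumption for illustration: `standin_sampler` is a hypothetical placeholder for real sampler output (it produces independent Gaussian draws, so n_eff equals the number of draws), the error formula assumes a roughly Gaussian target, and the threshold z=4 is arbitrary.

```python
import math
import random
import statistics


def variance_within_mcse(draws, true_var, n_eff, z=4.0):
    """Return True if the sample variance is within z Monte Carlo
    standard errors of the known variance.

    Assumes a roughly Gaussian target, for which the standard error of
    the variance estimate is about true_var * sqrt(2 / n_eff).
    """
    estimate = statistics.variance(draws)
    mcse = true_var * math.sqrt(2.0 / n_eff)
    return abs(estimate - true_var) <= z * mcse


def standin_sampler(n, scale=1.0, seed=0):
    """Hypothetical stand-in for a sampler: independent N(0, scale^2) draws.

    scale < 1 mimics a sampler that underestimates the posterior variance.
    """
    rng = random.Random(seed)
    return [scale * rng.gauss(0.0, 1.0) for _ in range(n)]


n = 20000
good = standin_sampler(n)               # target variance 1.0
shrunk = standin_sampler(n, scale=0.9)  # variance deliberately too small

print(variance_within_mcse(good, 1.0, n))
print(variance_within_mcse(shrunk, 1.0, n))
```

With real MCMC output the draws are correlated, so n_eff would have to come from the sampler's effective sample size estimate rather than the raw number of draws, and the run would need enough effective samples for a small bias to exceed the tolerance.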
