I am wondering how mature this all is and what the plans are for integrating it into the Stan repositories (Stan algorithm, MPI subsystem, MPI & threading)? This may sound impatient, but the results look very promising, and the community would benefit a lot from this.
New warmup strategies will likely need a bit of alignment here and there, which will be quite a process to go through, but the merits of this work seem absolutely worth going that extra mile.
I can certainly help/comment on the threading/MPI bits.
(bigger changes like this are hard to carry through; I speak from recent experience)
IMO the best way to facilitate this effort right now is to try it on your own models. Once we have enough confidence in the algorithm, we can move on to implementation details.
The final goal is to provide users with principled guidance on how to use it. Along the way we’ll need to identify, for example, default values for target ESS & Rhat, as well as the best way of aggregating stepsize & metric. Despite some success, the algorithm can also fail on simple models. Take the eight schools model, for instance: the proposed algorithm can end up with a suboptimal metric/stepsize and significantly more divergences. The following summary is based on a rather large target_ESS (=800) and a comparison against regular Stan runs with the same num_warmup (=600).
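To make the termination criterion concrete, here is a minimal Python sketch of the kind of cross-chain check being discussed: keep extending warmup until every parameter’s estimated ESS exceeds target_ess and its split-Rhat drops below target_rhat. The function names, the crude ESS estimator, and the default thresholds are illustrative assumptions for this thread, not the actual implementation.

```python
import numpy as np

def split_rhat(draws):
    """Split-Rhat for one parameter; draws has shape (chains, iters)."""
    c, n = draws.shape
    half = n // 2
    split = draws[:, : 2 * half].reshape(2 * c, half)   # split each chain in two
    within = split.var(axis=1, ddof=1).mean()           # W: mean within-chain variance
    between = half * split.mean(axis=1).var(ddof=1)     # B: between-chain variance
    var_hat = (half - 1) / half * within + between / half
    return np.sqrt(var_hat / within)

def crude_ess(draws, max_lag=200):
    """Very crude multi-chain ESS: average autocorrelations over chains,
    truncate the sum at the first negative lag."""
    c, n = draws.shape
    centered = draws - draws.mean(axis=1, keepdims=True)
    acov = np.array([[np.mean(ch[: n - t] * ch[t:]) for t in range(min(max_lag, n - 1))]
                     for ch in centered])
    acf = acov.mean(axis=0)
    rho = acf / acf[0]
    tau = 1.0
    for t in range(1, len(rho)):
        if rho[t] < 0:
            break
        tau += 2.0 * rho[t]
    return c * n / tau

def warmup_done(draws, target_ess=800, target_rhat=1.01):
    """Stop extending warmup once every parameter clears both thresholds.

    draws has shape (chains, iters, params), pooled over the adaptation
    windows collected so far.
    """
    ess = np.array([crude_ess(draws[:, :, k]) for k in range(draws.shape[2])])
    rhat = np.array([split_rhat(draws[:, :, k]) for k in range(draws.shape[2])])
    return ess.min() >= target_ess and rhat.max() <= target_rhat
```

The point of pairing the two checks is that a crude ESS estimate alone can look fine on unmixed chains; requiring a small split-Rhat as well guards against stopping warmup too early.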
The performance on sblrc-blr from posteriordb looks promising. This is not a cherry-picked nice-looking run but a consistent outcome. Among the models I’ve tested, this one shows the most significant improvement in ESS.
Not sure I understand the question. Some benchmarks I’ve run show that the ESS I see is consistent with that from standard runs. Depending on the algorithm’s tuning parameters, it could be higher or lower.
Can I dig this up to ask @yizhang and @bbbales2 what kind of speedup (wall time until sampling starts) and efficiency gains (total # of leapfrog steps during warmup) one can expect from this? Mainly for ODE models, but also for other models. I’d say only difficult models are interesting, i.e. models where warmup can take quite some time (a small sketch for counting warmup leapfrog steps is at the end of this post).
Edit: I guess it would be easiest to just try it myself. However, apparently I need a more recent version of stanc/math than the one included in the linked repo. I guess @stevebronder and @wds15 are actively working on the above algorithm? What’s the best way to get your working copy, and is it the same algorithm as proposed/evaluated in this thread?
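For the leapfrog-count comparison mentioned above, here is a small sketch, assuming a CmdStan run with save_warmup=1 writing to a hypothetical output.csv, and a num_warmup placeholder matching the run’s setting; it simply sums the n_leapfrog__ column over the warmup iterations:

```python
import pandas as pd

# Assumes the run was something like:
#   ./model sample save_warmup=1 num_warmup=600 data file=... output file=output.csv
# "output.csv" and num_warmup below are placeholders for the actual run settings.
num_warmup = 600

draws = pd.read_csv("output.csv", comment="#")  # skips CmdStan's '#' comment lines
# with save_warmup=1, the warmup iterations are the first num_warmup rows
warmup_leapfrogs = draws["n_leapfrog__"].iloc[:num_warmup].sum()
print(f"total leapfrog steps during warmup: {warmup_leapfrogs}")
```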