Rainier currently provides two samplers: affine-invariant MCMC, an ensemble method popularized by the Emcee package in Python, and Hamiltonian Monte Carlo, a gradient-based method used in Stan and PyMC3.
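For intuition, here is a minimal, self-contained sketch of the "stretch move" at the heart of affine-invariant ensemble sampling (Goodman & Weare 2010, the algorithm Emcee popularized). This is purely illustrative, not Rainier's implementation; all names here are hypothetical.

```scala
import scala.util.Random

object StretchMove {
  // One stretch-move update for walker i, following Goodman & Weare (2010):
  // draw z from g(z) ∝ 1/sqrt(z) on [1/a, a], propose y = xj + z * (xi - xj),
  // and accept with probability min(1, z^(d-1) * exp(logP(y) - logP(xi))).
  // Assumes an ensemble of at least two walkers.
  def update(
      walkers: Array[Array[Double]],
      i: Int,
      logP: Array[Double] => Double,
      a: Double = 2.0,
      rng: Random = new Random
  ): Array[Double] = {
    val xi = walkers(i)
    // pick a different walker uniformly at random
    var j = rng.nextInt(walkers.length)
    while (j == i) j = rng.nextInt(walkers.length)
    val xj = walkers(j)

    // inverse-CDF sample of z from g(z) ∝ 1/sqrt(z) on [1/a, a]
    val u = rng.nextDouble()
    val z = math.pow(1.0 + (a - 1.0) * u, 2) / a

    val proposal = Array.tabulate(xi.length) { k =>
      xj(k) + z * (xi(k) - xj(k))
    }
    val logAccept =
      (xi.length - 1) * math.log(z) + logP(proposal) - logP(xi)
    if (math.log(rng.nextDouble()) < logAccept) proposal else xi
  }
}
```

Note that this method needs only log-density evaluations, no gradients, which is what makes it a useful complement to the gradient-based HMC sampler.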
Depending on your background, you might think of Rainier as aspiring to be either “Stan, but on the JVM”, or “TensorFlow, but for small data”.
As a rough comparison, Rainier seems to yield a speedup of 10x or more relative to equivalent Stan models. This is promising, though please keep in mind that benchmarking is hard, micro-benchmarks are often meaningless, and Stan’s sampler implementation is much more sophisticated and much, much, much better tested than Rainier’s!
… Within those constraints, however, it is extremely fast. Rainier takes advantage of knowing all of your data ahead of time by aggressively precomputing as much as it can, which is a significant practical benefit relative to systems that compile a data-agnostic model. It produces optimized, unboxed, JIT-friendly JVM bytecode for all numerical calculations. This compilation happens in-process and is fast enough for interactive use at a REPL.
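To illustrate the precomputation idea (a simplified sketch, not Rainier's actual IR or compiler), consider a model represented as an expression tree in which the data is already known: any subtree that contains no parameters can be folded into a constant once, before sampling begins, so that only parameter-dependent arithmetic runs on every iteration.

```scala
// A hedged sketch of data-aware constant folding; these types are
// hypothetical and much simpler than a real model compiler's IR.
sealed trait Expr
case class Const(value: Double)         extends Expr // known data
case class Param(name: String)          extends Expr // unknown parameter
case class Add(left: Expr, right: Expr) extends Expr
case class Mul(left: Expr, right: Expr) extends Expr

object Precompute {
  // Recursively evaluate every subtree that involves no parameters.
  def simplify(e: Expr): Expr = e match {
    case Add(l, r) =>
      (simplify(l), simplify(r)) match {
        case (Const(a), Const(b)) => Const(a + b) // folded ahead of time
        case (sl, sr)             => Add(sl, sr)
      }
    case Mul(l, r) =>
      (simplify(l), simplify(r)) match {
        case (Const(a), Const(b)) => Const(a * b)
        case (sl, sr)             => Mul(sl, sr)
      }
    case other => other
  }
  // e.g. simplify(Add(Mul(Const(2.0), Const(3.0)), Param("mu")))
  //      == Add(Const(6.0), Param("mu"))
}
```

A system that compiles a data-agnostic model has to leave all of these data-dependent subexpressions symbolic; knowing the data up front is what lets the work be done once instead of once per sample.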
Our implementation includes the dual-averaging automatic step-size tuning from the NUTS paper, but requires you to manually specify the number of leapfrog steps. In the future, we plan to implement the full NUTS algorithm, which also dynamically selects the number of steps.
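For reference, here is a minimal sketch of that dual-averaging update as described in the NUTS paper (Hoffman & Gelman 2014), using its standard constants; the class and method names are hypothetical, not Rainier's API.

```scala
// A hedged sketch of dual-averaging step-size adaptation (Hoffman & Gelman
// 2014). After each warmup HMC iteration we feed in that iteration's
// acceptance probability alpha and nudge log(stepSize) so the average
// acceptance rate approaches the target delta (commonly around 0.8).
class DualAveraging(initialStepSize: Double, delta: Double = 0.8) {
  private val mu    = math.log(10.0 * initialStepSize)
  private val gamma = 0.05
  private val t0    = 10.0
  private val kappa = 0.75

  private var m         = 0
  private var hBar      = 0.0
  private var logEps    = math.log(initialStepSize)
  private var logEpsBar = 0.0

  // Call once per warmup iteration; returns the step size to use next.
  def update(alpha: Double): Double = {
    m += 1
    val eta = 1.0 / (m + t0)
    hBar = (1.0 - eta) * hBar + eta * (delta - alpha)
    logEps = mu - (math.sqrt(m.toDouble) / gamma) * hBar
    val w = math.pow(m.toDouble, -kappa)
    logEpsBar = w * logEps + (1.0 - w) * logEpsBar
    math.exp(logEps)
  }

  // Smoothed step size to freeze once warmup ends.
  def finalStepSize: Double = math.exp(logEpsBar)
}
```

The number of leapfrog steps stays fixed throughout; choosing it dynamically per iteration is exactly the part the full NUTS algorithm would add.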