Addressing Stan speed claims in general

If a dataset is simulated you should have many of them, and you should report statistics about the distributions of ESS/sec, RMSE, etc. This gives a lot more information and doesn’t unfairly penalise one method over another just because of bad luck with the simulated data. This type of test also tells you a lot about how well a method does across the range of models that are appropriate to the data.
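As a rough illustration of that kind of reporting, here is a minimal Julia sketch; `simulate_data` and `fit_and_score` are hypothetical stand-ins for whatever data generator and sampler are being compared.

```julia
using Statistics, Random

# Hypothetical stand-ins: simulate_data(rng) draws one dataset from the true model,
# and fit_and_score(data) runs a sampler and returns (ess_per_sec, rmse) for that dataset.
function benchmark_over_datasets(simulate_data, fit_and_score; ndatasets = 100, rng = MersenneTwister(1))
    ess_per_sec = Float64[]
    rmse = Float64[]
    for _ in 1:ndatasets
        data = simulate_data(rng)
        e, r = fit_and_score(data)
        push!(ess_per_sec, e)
        push!(rmse, r)
    end
    # Report the distribution, not a single lucky (or unlucky) draw.
    summarize(x) = (median = median(x), q10 = quantile(x, 0.1), q90 = quantile(x, 0.9))
    return (ess_per_sec = summarize(ess_per_sec), rmse = summarize(rmse))
end
```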

1 Like

Thanks for the feedback. @anon75146577, it’s good to hear from a member of the UofT community. Unfortunately, I had to submit the conference paper already, but I will keep your comments in mind when revising.

Thanks again for your suggestions regarding the Stan implementations we used to evaluate DynamicPPL and Turing. The latest results are not online atm as the respective paper is under review, but we will publish the results and implementations that we used for reproducibility asap.

To clear up a misconception I have read a few times about our paper: the aim of this work was not to misrepresent the efficiency of Stan. Quite the opposite is true; we intentionally evaluated against Stan because we consider it to be the most robust and efficient PPL. This is also mentioned in the paper and is the reason why we tried our best to write efficient Stan models to compare against – but failed to do so in some cases. Fortunately, we now have more efficient implementations thanks to the effort by @stevebronder.

At least the Turing team (including myself) consider Stan to be the most efficient PPL for static models.

Moreover, we don’t think that Turing.jl is faster on static models than Stan. Not sure where this sentiment comes from; our paper clearly states that this is not the case in general. I also don’t think that people use Stan as a benchmark punching bag; if anything, Stan is considered the most competitive baseline that exists.

I can only talk for the Turing project here. In our case, the code base is almost exclusively developed by passionate people in their spare time. No one is developing for Turing during his/her PhD. We are all from different backgrounds (and continents) and are simply passionate about providing a robust PPL for the Julia community. Just like the Stan team, we are dependent on third-party funding to support our efforts.

We had similar discussions in our team meetings and we heavily rely on Google Summer of Code (GSoC) and Google Season of Docs (GSoD) students. Maybe this could be an option for the Stan team too.

You can use R and Python almost seamlessly in Julia (and vice versa) through bridges. Julia is a bit like a glue language that interoperates nicely with all kinds of existing languages.

6 Likes

Just to throw in some opinion as a non-stats, non-numerics astrophysics person: I came to Stan from a world where likelihoods are used improperly (everything is an L2 norm) and ad hoc methods for everything, from two-distribution tests to custom ND-KS-test-based MCMC, get used frequently. No one tests the validity of these approaches in the literature, and no one is particularly worried about whether inferences are proper as long as you get the hype-inducing kick of so-called discovery.

Some of these methods are “fast.” But I’ve messed around a lot with different PPLs/samplers and many of them give you incorrect answers for even the simplest models. I spent some time away from Stan playing with PyMC3. I was not well versed in Stan and thought I could more easily implement some of the data-specific models (instrument response, etc.) there in a Pythonic way. Have you ever tried for loops in those Theano-based PPLs? There were some speed differences, but I was frustrated at PyMC3 not accurately fitting simulated data. I recoded in Stan and it worked. It came down to numerical stability. Maybe that has changed, but it was a big lesson to me. I also didn’t like the “freedom” to combine different MCMC methods on different parameters. HMC this, and MCMC that, and slice this. Where was the assurance any of this would lead to robust inference?

While it may not be universal, I find most of these packages preaching features over fidelity. Speed over understanding. Hype over robustness.

As I focused on learning Stan, I learned to write rather complex models (coming from a C++ background helps) and this is no longer an issue for me, but it does present a barrier to some. I’ve heard people developing some Julia-based HMC frameworks (not Turing) say “you can’t do anything complex in Stan.” This may be an advertising problem. I’m also still rather a novice at what makes efficient Stan code, and while the documentation helps, it is a lot to swallow.

But again, I’m not here for speed, I’m here for robustness, and robustness takes time to understand… hype will NOT solve that. One of the best talks about Stan is @Bob_Carpenter’s talk about the low-level parts of the backend: https://youtube.com/watch?v=6NXRCtWQNMg. It’s this type of care that brings people who want to do things right to Stan.

I would love faster models, but I would rather have Stan focus on being correct and me learn to write better models than have Stan focus on hype. The others who come to Stan are likely coming for the same reasons I did: robustness and a community that is more focused on good model building than on the latest GPU specs that will blow away a TensorFlow model on a toy data set.

Stay away from hype and the user community will be the one you want.

11 Likes

I agree entirely: robustness and reliable inference should be the primary concern of any PPL.

The Turing team has only recently focused on building a more efficient infrastructure, after we got criticism about performance based on sometimes wacky benchmarks by others. Our main concern was, and still is, to provide a robust library that is well embedded in the Julia ecosystem.

That said, speed is a major concern for many people, and it should be taken very seriously—especially given the new applications that are often concerned with large-scale data.

I believe that a joint effort of the different PPL teams can substantially help to keep improving the respective systems in terms of reliability and speed. No one is looking for a wacky PPL just to use new algorithms, but ignoring recent advancements is also not the way forward.

3 Likes

I think what should be addressed there is that Stan is a bit of a closed world, which has its good and bad points. The good point is that it’s much easier to know what the universe of available functions is, which makes it easier to document. The downside is that you don’t have an ecosystem with 10,000 developers contributing functionality. With Julia’s Turing.jl, we from SciML were demonstrating that Turing.jl + Julia’s DifferentialEquations.jl could be much more performant than Stan, which is not an indictment of Stan and its AD practices, but rather an indictment of the ODE solvers that were chosen. From looking at the files, it looks like there are really just two people maintaining that portion, and it’s just calling into Eigen and Sundials. While those are respectable solver libraries, we have shown repeatedly in the SciMLBenchmarks that there are usually alternative solvers that are more efficient than dopri5, adams, and bdf (those are quite old methods! None of them are even A-stable above order 2!), among other issues, like Sundials’ Adams method being quite finicky about reaching tolerance.

So our benchmark demonstrating advantages of Turing+DifferentialEquations.jl over Stan (which BTW was not performed by the Turing developers) is more about demonstrating what happens when you mix a lively developer ecosystem (~20 active developers across SciML at a given time) with a full, lively PPL ecosystem. I don’t think anyone has really sat down and optimized exactly how the ODE solvers of Stan will work on different subsets of problems, but with DifferentialEquations.jl there are >300 solvers covering different cases in optimal ways, along with adaptive high-order implicit methods for SDEs, adaptive high-order implicit methods for DDEs, etc. These are all extremely well tested (the benchmarks are fairly thorough, are updated roughly monthly through automated systems, and the CI takes >50 hours total if you count all of the convergence and regression testing going on). The Turing.jl community gets this for free by integrating with a language-wide AD system.
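For readers who haven’t seen that combination, here is a condensed sketch along the lines of the public Turing.jl Bayesian differential equations tutorial; the Lotka–Volterra setup, priors, and noise model below are illustrative placeholders rather than the benchmark code itself.

```julia
using Turing, DifferentialEquations, LinearAlgebra

# Lotka–Volterra right-hand side (out-of-place form f(u, p, t)).
function lotka_volterra(u, p, t)
    x, y = u
    α, β, γ, δ = p
    return [α * x - β * x * y, δ * x * y - γ * y]
end

prob = ODEProblem(lotka_volterra, [1.0, 1.0], (0.0, 10.0), [1.5, 1.0, 3.0, 1.0])

# Illustrative model: priors and the observation noise are placeholders.
@model function fit_lv(data, prob, ts)
    σ ~ truncated(Normal(0, 1), 0, Inf)
    α ~ truncated(Normal(1.5, 0.5), 0, Inf)
    β ~ truncated(Normal(1.0, 0.5), 0, Inf)
    γ ~ truncated(Normal(3.0, 0.5), 0, Inf)
    δ ~ truncated(Normal(1.0, 0.5), 0, Inf)
    # The ODE solve sits inside the model; AD propagates through it.
    predicted = solve(prob, Tsit5(); p = [α, β, γ, δ], saveat = ts)
    for i in eachindex(ts)
        data[:, i] ~ MvNormal(predicted[i], σ^2 * I)
    end
end

# chain = sample(fit_lv(data, prob, ts), NUTS(0.65), 1000)
```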

When we are timing Stan vs Turing in terms of differential equations, this kernel advantage is what’s demonstrated, and again it’s not an indictment of Stan but rather an issue of focus, expertise, and language: it doesn’t look like a horde of Stan developers wakes up every morning to optimize differential equation solvers and improve their accuracy, but in Julia that has been the case for a fairly long time now.

I think the main issue with Stan in this department is really just that the kernels are all baked in. Seeing an FFI (Foreign Function Interface) in Stan’s future would be nice, because then external developers like us could add new differential equation solvers to Stan by making it call into Julia. There’s nothing wrong with Stan’s design or Stan’s mission; it’s just not a mission that means you have 10 choices of GPU-accelerated preconditioners for a CFD simulation, and so Stan will not benchmark well in those regimes simply due to the implemented kernels. PPLs like Turing have a different mission of allowing a whole general-purpose language to be extended in probabilistic ways and to integrate with language-wide automatic differentiation, which in turn makes it do well on things which are not part of the traditional sphere of Bayesian inference.

It’s not an advertising problem. Unless the docs and the GitHub repo are mistaken, Stan doesn’t have a DDE solver, so External delay differential equation solver is a real issue to somebody. It doesn’t have an SDE solver, so Reverse mode SDE idea is a real issue to somebody. It probably doesn’t intersect with the vast majority of Stan users (Stan users probably pre-select away from being the core differential equation modeling crew), but I think calling widely used and effective models just pure hype is a bit disingenuous to people’s research that doesn’t fall into the currently available Stan kernels.

I think it’s fine to just say that doesn’t fall into the general mission of Stan: Stan does what it does well, and people respect that.

9 Likes

Just to clarify a point. This is the new ODE interface that was added basically two weeks ago and was done by Ben. Many more devs have contributed to the old ODE interface.

4 Likes

A developer of Turing.jl here. I would love to see a community-wide extensive benchmark started by the Stan community that punches back every PPL out there ;) It’s hard to be an expert in optimizing models in every PPL. Naturally, one has to start the benchmarks somewhere. Those benchmarks will by definition make claims about the performance of various PPLs; some will be slower than others. You just can’t make everyone happy!

Now if you think inaccurate claims are being made about your favorite PPL, I think the healthy thing to do in this case is what @stevebronder did in the Turing vs Stan case which is contributing optimizations to the benchmark. I think if everyone has good intentions, those contributions will be warmly received. They certainly were in the Turing case. And at the end of the day, I think it’s important to keep in mind that no one is trying to take anyone down. It’s natural to want the PPL you are developing to be better in every way but that’s just not possible sometimes. So imo the ultimate goal of these benchmarks should be to identify strengths and weaknesses of different PPLs. The weaknesses can be worked on and improved and the strengths can be highlighted to attract users and funding. This is how the game works!

Finally, I can’t stress enough how much respect the Turing devs have for Stan as a PPL and the Stan devs and community. In many ways, your work has revolutionized the field of probabilistic programming and has inspired many people to join the field so for that we are grateful :)

13 Likes

Chris, thanks for following up about the FFI topic. For my present project, Stan is what my collaborators require and is the way to go because of the easy log likelihood calculation, so when time is available, I’d definitely like to contribute to the state of the differential equation solvers in Stan (which would also help me finish my projects and, alas, finally allow me to graduate). There is some additional ongoing discussion about figuring out differential equation development priorities in Stan here: Stan differential equation functionality development priorities.

Because the FFI situation was unknown, I had a growing interest in contributing to an auto-switching ODE solver, which would probably be the algorithm that would be most beneficial to my own projects. However, we did weigh some pros and cons of building solvers versus putting resources instead to forward an FFI. If the Stan devs do decide to put resources toward an FFI after further discussion, efforts towards building native solvers in Stan Math would be a less efficient and redundant goal.

3 Likes

Sorry, I did not mean to imply that there is not principled modeling outside of Stan. Speed of course matters, but my experience is similar to those above that parameterization almost always gains more speed than facilities. Of course, they work together and coordination of the guts of these things is key.

It’s more about the direction of the Stan community, as @betanalpha has pointed out. At some point, there will be something better in terms of ease of coding for speed, coding to the silicon, etc. But I’ve seen in my field that speed-based approaches/methods/languages typically address correctness later. Again, this doesn’t have to be universal, but it’s a trend, and in the academic world it is a very bad one for early students to learn.

The ODE stuff is a relevant point for sure. Even in astro we do not see such a need for that, but this is highly field dependent and that type of stuff is definitely worth investing in. I wonder, as I do not know, whether the approaches in Stan were chosen based on numerical stability, etc. One can certainly autodiff through all of Julia, but does it translate to stability for free? I remember being frustrated that Stan couldn’t do high-D interpolation out of the box… but you begin to understand that this is a very bad thing to do with HMC in the first place. In PyMC3 you can sample integers… not a good thing…

Off topic: how many of the things that come for free in JITted languages do not require some of the carefulness that goes into delicate low-level numerical code? I’m asking out of ignorance.

3 Likes

Just wanted to say I appreciate everyone taking the time to chime in here. Thanks to the Stan users and devs as well as the Julia and Turing folks!

11 Likes

I will take a crack at responding here but I will try to be brief not to derail the original discussion too much. I can speak for Julia and Turing. First of all, nothing in Julia is completely for “free” but the asymptotic cost is kind of free because you simply have to pay the cost of developing a library once. Then this library can be used along with an exponential number of combinations of packages with no additional development cost. This talk describes why this super-synergy happens in Julia but not in a language like C++ for example. What this means is that I can run an ODE solver using a high (or low) precision fancy number type on the GPU and embed it in a Turing model to run MCMC on the ODE parameters or perform MLE or MAP using an optimization package without the developers of the ODE solver, the GPU package, the fancy number type, the optimization package or Turing ever trying to make their packages compatible with each other, in the ideal case. In reality, there are a few things to keep in mind when developing a package to make it sufficiently “generic” but they are not many and can be taught to beginners.

What the above synergy does is create a system where a clear separation of concerns tends to happen. The ODE folks can develop and thoroughly test ODE solvers, the number type folks can develop and test number types, the optimization folks can develop and test optimization algorithms, etc. Everyone needs to make sure their implementations are correct, numerically stable, fast, etc. So the low-level care is there when developing each package, but the package is only developed once and then re-used in more ways than the author ever imagined. I hope that answered your questions.
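A toy illustration of that separation of concerns: a function written once, with no particular number type in mind, works with Float64, BigFloat, and ForwardDiff dual numbers, even though none of those packages know about each other (the only assumption here is that ForwardDiff is installed).

```julia
using ForwardDiff

# A "library" function written once, with no reference to any particular number type.
normsq(v) = sum(x -> x^2, v)

normsq([1.0, 2.0])                        # Float64
normsq(big"1.0" .+ [0.5, 0.25])           # BigFloat: higher precision for free
ForwardDiff.gradient(normsq, [1.0, 2.0])  # dual numbers: gradients for free
```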

This reply turned out longer than I had liked. To the admins, feel free to move this comment to a separate discussion if you think it’s off-topic.

5 Likes

We’ve been using Stan for DDE and SDE problems for years now, and the difficulty is always parametrizing the model for good posterior geometry.

This is quite the same point that you are making in your own SciML and UDE papers, namely that better model structure is the way forward for machine learning.

Having Stan make use of Julia solvers is a matter of plumbing in the software, either at the Stan math library level or in the compiler (e.g. targeting Turing.jl).

1 Like

I’ll second that. Sorry I missed the last month of conversation somehow!

I appreciate long-form responses, so thanks!

The result should still give you the right stationary distribution asymptotically. As far as assurances, I think the best you can do is simulation-based calibration for the algorithm and then posterior predictive checks for the fit.
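For concreteness, here is a rough outline of the simulation-based calibration loop, with `prior`, `simulate_data`, and `fit_posterior` as hypothetical stand-ins for the model and algorithm under test; the check is that the rank statistics come out uniform.

```julia
# Hypothetical pieces: prior (a distribution), simulate_data(θ), and
# fit_posterior(data, ndraws) returning roughly independent posterior draws of θ.
function sbc_ranks(prior, simulate_data, fit_posterior; nreplications = 500, ndraws = 100)
    ranks = Int[]
    for _ in 1:nreplications
        θ_true = rand(prior)                   # draw a parameter from the prior
        data = simulate_data(θ_true)           # simulate data given that parameter
        draws = fit_posterior(data, ndraws)    # run the algorithm being tested
        push!(ranks, count(<(θ_true), draws))  # rank statistic in 0:ndraws
    end
    return ranks  # should be approximately uniform on 0:ndraws if the algorithm is calibrated
end
```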

For example, PyMC lets you alternate discrete sampling with Gibbs with continuous sampling with HMC. That’s something we can’t do in Stan.

Our original design was to do just that, but we finally realized that discrete sampling is very problematic computationally (it doesn’t mix very well).

What would be nice to have is some applications of simulation-based calibration to see if it works. The problem we foresaw is that HMC is very sensitive to tuning step size and mass matrix to the geometry being sampled, but that changes every iteration with alternating discrete sampling.

We don’t know how to get around that.

Writing an efficient Stan program requires writing efficient and numerically stable arithmetic and also thinking about autodiff execution, which is unusual in its two passes, the second of which is interpreted.

I’ve become convinced that the benchmark we want is wall time to a minimum effective sample size of 200 or so. That’s what most inference demands. For those who need larger ESS, the important thing is ESS per unit time after warmup.
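One way to turn that into a number, sketched in Julia under the assumption that you already have per-parameter ESS values (from whatever diagnostics package you use) plus the warmup and sampling wall times; the extrapolation is deliberately simple-minded.

```julia
# Wall time to reach a target ESS, extrapolating linearly from the sampling phase.
# ess_per_param: vector of ESS values for the post-warmup draws actually kept.
function time_to_target_ess(ess_per_param, warmup_time, sampling_time; target = 200)
    worst_ess = minimum(ess_per_param)  # the slowest-mixing parameter dominates
    return warmup_time + sampling_time * (target / worst_ess)
end

# ESS per second after warmup, for people who need much larger effective sample sizes.
ess_per_sec(ess_per_param, sampling_time) = minimum(ess_per_param) / sampling_time
```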

What we find a lot of is people saying their iterations are faster than Stan without accounting for ESS or even whether they get the right answer.

Is there something specific here you think we should be investigating (other than recoding Stan in a different language)?

I’m not sure what you mean here, but some of the PPLs, like Edward, were explicitly motivated as algorithm test beds.

You mean that we require all the functions to be declared in the parser and documented in our documentation?

That’s 10,000 Julia developers overall? How many write the kind of differentiable code you can use in Turing.jl? I really like the basic idea and understand the argument for it.

My understanding is that it’s implemented something like the way “object oriented” code is implemented in R through just overloading functions of the correct signature rather than defining methods on objects directly as in other OO systems like C++ or Java. It’d be super cool if you had a lot of candidates that had robust and efficient autodiff.

How does someone know what’s available as a function in Turing.jl?

There’s the same issue with systems like PyMC3. They’re not really just importing any old Python code any more than Stan can import any old C++ code. It has to be tailored to their Theano back end.

How many people manage to do that? That’s what I was asking when I was asking how many of those 10K developers add robust gradients to their packages? We’ve definitely taken the more Apple-like approach on purpose to avoid shifting this burden to our users and to be able to control the overall experience. We know that becomes limiting in terms of functionality.

In case this wasn’t clear, we’re doing exactly the same thing in using tried and true ODE integrators that others have developed. We’ve evaluated a bunch of systems that people have told us were better than what we’re using, but we haven’t found ones that were better than what we have. I’m not saying Julia doesn’t have better solvers, just that the ones we’ve evaluated in C++ haven’t been great. It’s clear Julia devs think Julia has the best ODE solvers, so one option would be to translate those to C++ for inclusion in Stan. I would think that would result in faster code in the end.

Speaking of which, how many of Julia’s ODE solvers are written in Julia and how many are FFI from C/C++ or Fortran or something else? I have trouble with the docs for packages like Julia’s that list 200 choices of ODE solver—I’m not enough of an expert to wade through all the options.

I guess what I’m saying here is that having too many choices can also overwhelm new users, though many of them just cut-and-paste what their labmates did last year (which means no discovering new functions, but also hopefully not making poor choices).

We have had to think about adding sensitivities, though, since the popular off-the-shelf solvers don’t supply accurate versions of those.

@yizhang has been working on that for PDE solvers, but I’m not sure if he’s looking at a general FFI or just something that’ll allow C++ on the back end.

Right now, we only allow C++ extensions by users.

What’s a CFD simulation?

Right. The stats people don’t do a lot of diff eq solving and the diff eq people don’t do a lot of stats. I work with a bunch of diff eq solving specialists now in my new job.

The Stan developers understand quite well that there’s a vast array of useful models that aren’t so easy to code in Stan. SDEs are one of the things we have a grant at the moment to address. Neural networks are the 800 pound gorilla in this room, and we also don’t have effective ways of fitting those. Systems like TensorFlow Probability really lean heavily on their integration with good NN libraries. (I have no idea where Turing.jl is in that.)

It’s not that, it’s the point you made before—we don’t have 10,000 developers! We’re more than happy to take PRs around all kinds of new models we don’t currently fit efficiently or easily.

It’s come a long way since @syclik and I hacked together the first version that just autodiffed through the RK45 algorithm in Boost!

We’re certainly not trying to do that. We’re not even getting entangled in most of these evaluations because we usually can’t get people to do the evaluations we care about.

My advice is to do what we did for our autodiff paper on arXiv—write to all the developers of those other systems with your benchmark code and ask them if there’s a way to improve it before publishing misleading benchmarks. It’s a lot of work, which is why we usually don’t do it!

I agree completely. The TensorFlow Probability folks did a fairly thorough benchmark recently that didn’t make me cringe (which is saying something, given the number of bad evals I’ve seen in papers and email).

@mans_magnusson has a repo with our current thinking about developing benchmark examples that can be used by other systems. It has the model definitions and 10K draws with an effective sample size of roughly 10K. It’s not yet an official Stan project—it’s still in Mans’s personal repo. Looks like Julia has MCMCBenchmarks.jl. It looks like it is somehow tied to McElreath’s book Statistical Rethinking.

Thanks for letting us know. The paper was very dismissive.

That’s the kind of thing that’d be great to add to Stan. All we need to communicate are parameter values/data in and the solutions and sensitivities out, so an interface wouldn’t be too hard.

We’ve tried to be very careful about this and in some cases trade speed for robustness. For example, @Adam_Haber just did a nice comparison of the correlation matrix Cholesky factor transform and derivatives. It turns out TensorFlow’s bijector used a different approach that was faster and worked fine with a lot of data, but was less robust with small data situations. So we decided not to change how we coded that.

Plumbing is hard! Especially if you want things to be easy to install and use and robust in operation across operating systems and platforms.

I think it helps to work on both fronts. Better algorithms aren’t enough by themselves, but they’re an orthogonal concern to better model parameterization (better for HMC meaning a posterior that’s closer to a standard normal).
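As a concrete (if well-worn) example of what better parameterization means, here is a centered vs. non-centered hierarchical scale sketched in Turing.jl, since this thread already has Julia snippets; the same trick is standard practice in Stan, and the priors here are only illustrative.

```julia
using Turing

# Centered: theta ~ Normal(mu, tau) directly; the posterior can develop a funnel HMC struggles with.
@model function centered(y, sigma)
    mu ~ Normal(0, 5)
    tau ~ truncated(Normal(0, 5), 0, Inf)
    theta ~ filldist(Normal(mu, tau), length(y))
    for i in eachindex(y)
        y[i] ~ Normal(theta[i], sigma[i])
    end
end

# Non-centered: sample standardized z and rescale, so the sampled posterior is closer to standard normal.
@model function noncentered(y, sigma)
    mu ~ Normal(0, 5)
    tau ~ truncated(Normal(0, 5), 0, Inf)
    z ~ filldist(Normal(0, 1), length(y))
    theta = mu .+ tau .* z
    for i in eachindex(y)
        y[i] ~ Normal(theta[i], sigma[i])
    end
end

# chain = sample(noncentered(y, sigma), NUTS(), 1000)
```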

6 Likes

Computational Fluid Dynamics. What I’ve done for PDEs is a proof of concept; more realistic work has been started by @IvanYashchuk in his PETSc project, Using PETSc library together with Stan. I don’t think it’s meaningful to jump into a large-scale 3rd-party simulator before we can clean up Stan’s MPI framework, as @IvanYashchuk has realized.

That’s a PyMC3/Theano issue, mostly because the “standard library” in that case conflicts with numpy usage. Theano, PyTorch, JAX, and TensorFlow have this issue because they have a sublanguage that is separate from how the vast majority of packages are implemented, and only packages implemented with semantics defined in that sublanguage (which is not even close to Python) are differentiable. That’s not a Turing.jl issue. In general, generic algorithms will work in Turing.jl because differentiability happens on the language, its AST, and its original functions as defined in Base and the standard libraries, not a sublanguage.

DifferentialEquations.jl was found to be compatible because someone tried it, not because someone actively made it work (their success is actually how I got into the field), and it’s not the only example of that occurring. We actually have a paper on this topic where we slap together rootfinders and ODE solvers to get a trebuchet, use a quantum circuit library (Yao.jl), and use the SDE solvers, all directly differentiated. A lot of this universal differential equation work is doing its next demonstrations on things like the Clima climate model, which is differentiable simply by existing. A lot of examples include libraries like BandedMatrices.jl, which were written before the AD libraries and not changed for the AD libraries, but are compatible with AD libraries because of how it all works on generic dispatches.

So yes, you can’t take “any” old code since there will be limits (mutation is the biggest limitation), but “most” libraries we use are autodiff compatible without the authors even knowing it, so yes there are thousands of people working on libraries that you can use in Turing.jl. Try it!
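A tiny concrete case of that: an iterative routine written with no AD in mind differentiates cleanly with ForwardDiff purely because it is generic Julia code (the only assumption is that ForwardDiff is available).

```julia
using ForwardDiff

# Babylonian square root: a generic iterative algorithm, written without AD in mind.
function babylonian_sqrt(x; iters = 10)
    t = (1 + x) / 2
    for _ in 2:iters
        t = (t + x / t) / 2
    end
    return t
end

babylonian_sqrt(2.0)                          # ≈ 1.4142136
ForwardDiff.derivative(babylonian_sqrt, 2.0)  # ≈ 0.3535534, i.e. 1 / (2 * sqrt(2))
```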

That’s a separate story… when features come for free, discoverability is a bit harder. DiffEqFlux.jl for example is basically a library which is just there to define a few helpers and document a ton of things that are compatible with differentiation, like neural graph differential equations, because without having a library to search for no one can find out they can do it (though the library itself did not have to be modified for the example to work). Because of this, probabilistic inference of neural graph differential equations will work in Turing.jl, though nobody has thought to do it, but hey free paper for whoever wants to. It is a very interesting question how to document features that are automatically generated by composability and not one that we really have a great solution to other than to write a billion tutorials.

The vast majority are pure Julia. Sundials.jl, ODEInterfaceDiffEq.jl, and LSODA.jl (and the benchmarking libraries MATLABDiffEq.jl, SciPyDiffEq.jl, and deSolveDiffEq.jl) introduce wrapped methods. The rest of the libraries are Julia libraries, and most of the methods are in OrdinaryDiffEq.jl and StochasticDiffEq.jl (DelayDiffEq.jl and StochasticDelayDiffEq.jl are then generated from those).

That’s why we have a default algorithm chooser: `solve(ODEProblem(f,u0,tspan,p))` will work in autodiff with forward and reverse modes, and it’ll choose the method for you along with stiffness detection and all of that. The rest is details for people who want to optimize.
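Spelled out end to end with a throwaway ODE, that default-chooser usage looks roughly like this (standard DifferentialEquations.jl calls; the decay equation is just an example):

```julia
using DifferentialEquations

# Exponential decay, out-of-place form f(u, p, t).
f(u, p, t) = -p * u

prob = ODEProblem(f, 1.0, (0.0, 5.0), 0.5)
sol = solve(prob)           # no algorithm given: the default chooser picks one, with stiffness detection
sol = solve(prob, Tsit5())  # or pick explicitly when you want to optimize
sol(2.0)                    # dense-output interpolation at t = 2.0
```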

Computational Fluid Dynamics, like https://github.com/CliMA/Oceananigans.jl for uncertainty quantification of large-eddy simulations and climate models.

Turing.jl has high-order adaptive implicit methods because it supports the Julia language. So those (https://diffeq.sciml.ai/stable/solvers/sde_solve/) can be used in Turing. It’s part of the Bayesian Differential Equations tutorial, and it will update in a few days to this new version which includes delay differential equations. Stochastic delay differential equations then of course follow by combining the two, and again that’s back to that discoverability problem when features work by default, but that’s a little niche for a tutorial in a PPL library.

Yeah, the hardest part of the plumbing in that case is that the sensitivity equations for differential equations themselves are a differential equation defined by AD calls, meaning that we cannot treat the ODE/SDE/DDE/DAE/SDDE/… RHS function as a black-box (if we want to do it efficiently). We need some way to swap back to a native AD for that RHS. We’re trying to work that out for integrating with PyTorch right now, but it’ll take some work. And we’ll want to do the same with Stan.
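For concreteness, these are the forward sensitivity equations in question: for $u' = f(u, p, t)$ with sensitivities $s_j = \partial u / \partial p_j$, the augmented system needs the Jacobians of $f$, which is why the RHS cannot stay a black box if you want efficiency.

$$
\frac{ds_j}{dt} = \frac{\partial f}{\partial u}\, s_j + \frac{\partial f}{\partial p_j},
\qquad s_j(t_0) = \frac{\partial u_0}{\partial p_j}.
$$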

3 Likes

+1 to this. Another good example is the data.table benchmarks, where they continuously updated their benchmarks and contacted all the other package developers to make sure the code they were testing is correct/performant. I’d really love to use posteriorDB for something like this.

I really like this. I’ve also harped at folks before about the “hurdle rate” as a good measure (given N warmup and M samples, what is your ESS?). Warmup is the real mind-killer!

For testing autodiff I think you really just need to rip away all the other modeling stuff and just look at it more like microbenchmarks on the forward and reverse pass.
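A bare-bones version of that kind of microbenchmark, assuming ForwardDiff, Zygote, and BenchmarkTools are available; the test function is arbitrary and only the gradient passes are timed.

```julia
using BenchmarkTools, ForwardDiff, Zygote

# An arbitrary test function; what matters is timing the passes, not the model around them.
f(x) = sum(abs2, x) + sum(sin.(x))
x = randn(1000)

@btime ForwardDiff.gradient($f, $x)  # forward mode
@btime Zygote.gradient($f, $x)       # reverse mode (forward sweep + reverse sweep)
```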

3 Likes

Also, I won’t speak for @mans_magnusson, but I really like the idea of reaching out to other PPL developers and pitching posteriorDB as a good solution for testing across languages. idk how ready it is for that, but if we want something everyone uses then I think it would be good to get feedback from other PPLs.

2 Likes

Didn’t expect to see CliMA here. I learned CUDA coding for ocean modeling from Frank Giraldo, who’s also on the CliMA core team. IIRC Oceananigans is just an incompressible CFD solver. It doesn’t come with UQ capability, and atmospheric modeling is handled by other CliMA components. So it’s a stretch to say it does UQ of climate modeling; it is about as relevant to Stan’s capability as saying C++ can do CFD.

Hi!

It is one of the issues I am trying to solve right now. The main problem is that we want to be sure that all models included from different PPLs are exactly the same (give the same/proportional log density for the same dataset), and this needs to be done using continuous integration with Travis.

We started with TFP and PyTorch but there are now some discussions on adding Julia PPLs as well. So, yes, I fully agree with you!

2 Likes