I have trouble turning all of them into action items, however.
Like, I know that if I look at a variable and it’s clearly some large, positive number, relative tolerance is the thing that will matter for it.
Likewise, I know that if a variable will be near zero, absolute tolerance will matter. So I can somehow translate scale into things to do in my workflow.
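As a concrete outside-of-Stan sketch of that scale-to-tolerance translation (the system and all the numbers here are made up): scipy keeps the local error below atol + rtol * |y| per component, so a vector atol is the direct action item when states live on different scales.

```python
# Made-up two-component system: y[0] lives around 1e6, y[1] decays to ~0.
# rtol governs the error control on the big component; atol governs the
# near-zero one.
from scipy.integrate import solve_ivp

def rhs(t, y):
    return [-0.1 * y[0], -5.0 * y[1]]

# Vector atol is the concrete action item: a loose floor (1.0) for the
# O(1e6) state, a tight one (1e-10) for the state that matters near zero.
sol = solve_ivp(rhs, (0.0, 10.0), [1e6, 1.0],
                rtol=1e-6, atol=[1.0, 1e-10])
print(sol.y[:, -1])
```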
I suppose I can look for divide-by-zeros and such in the right-hand side to find singularities.
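For instance, a made-up RHS showing the kind of term that scan would flag:

```python
# Toy RHS where the "divide by zero" scan pays off: the term 1/(1 - y)
# is singular at y = 1, so trajectories approaching 1 will drive any
# solver's stepsize toward zero.
def rhs(t, y):
    return [1.0 / (1.0 - y[0])]
```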
However, I am still stuck making a decision between RK45, Adams, and BDF for purposes of accuracy, stability, and order requirements. I might be able to pick a tolerance for any given solver, but not the solver itself.
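One crude but quantifiable way around that decision, at least outside of Stan, is to run the candidates on the same problem and compare cost; stiffness shows up as RK45 burning vastly more RHS evaluations than BDF. The system and numbers below are hypothetical, and scipy’s LSODA stands in for an Adams-type solver since it switches between Adams and BDF.

```python
# Hypothetical test system with eigenvalues -1 and -k, so k controls
# stiffness. Comparing solver cost on the same draw is one crude way
# to let the problem pick the method.
from scipy.integrate import solve_ivp

def rhs(t, y, k):
    return [y[1], -k * y[0] - (k + 1.0) * y[1]]

for method in ("RK45", "LSODA", "BDF"):  # LSODA switches Adams <-> BDF
    sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], args=(1000.0,),
                    method=method, rtol=1e-6, atol=1e-8)
    print(method, "success:", sol.success,
          "rhs evals:", sol.nfev, "jac evals:", sol.njev)
```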
Why is order requirement separate from accuracy?
Well, my suspicion is that people don’t do this manually. Like, I don’t even know what the stability region of the CVODES BDF solver would be anyway! I’d have to spend a lot of time with an ODE book and the CVODES docs.
Also, I suspect that sampling from a prior and seeing if the solver works is a more reliable method of finding problems than trying to analyze a linearization or something.
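A sketch of what that prior-predictive solver check could look like outside of Stan (everything here, prior included, is invented for illustration): draw parameters, solve, and look at the failure rate and the tail of the work distribution.

```python
# Invented one-state decay model with a wide lognormal prior on the
# rate. Draw from the prior, solve, record failures and solver work.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(1)

def rhs(t, y, theta):
    return [-theta * y[0]]

failures, work = 0, []
for _ in range(100):
    theta = rng.lognormal(mean=0.0, sigma=2.0)  # hypothetical prior
    sol = solve_ivp(rhs, (0.0, 10.0), [1.0], args=(theta,),
                    method="RK45", rtol=1e-6, atol=1e-8)
    failures += not sol.success
    work.append(sol.nfev)

# A heavy right tail in nfev across prior draws is the kind of problem
# a linearization at one point in parameter space can easily miss.
print("failures:", failures,
      "nfev p50/p90/p99:", np.percentile(work, [50, 90, 99]))
```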
I’m more thinking out loud here than anything else. Not trying to come to any conclusion, just trying to get the puzzle pieces to fit for what an ODE workflow might look like.
There’s also the difficulty of solving ODEs outside of Stan vs. inside of Stan. I wonder if it is difficult for people to generalize what they’ve learned about ODE solvers outside of Stan into Stan? Like do they spend all their time frustrated we haven’t documented our methods better, etc.? Or is it all fine?
For an order-$p$ method, $\|y_n - y(t_n)\| = C h^{p+1} y^{(p+1)}(t_n) + O(h^{p+2})$, so to improve accuracy one can either increase $p$ or reduce the stepsize $h$.
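To put numbers on that (my arithmetic, not something from the thread): for a $p = 4$ method, halving the stepsize shrinks the leading error term by

$$\frac{\|e(h/2)\|}{\|e(h)\|} \approx \left(\frac{1}{2}\right)^{p+1} = \frac{1}{32},$$

while raising $p$ at fixed $h$ multiplies it by roughly another factor of $h$ (modulo the change in $C$ and $y^{(p+1)}$). That’s one way to see why the order requirement is a separate knob from accuracy: the tolerance sets the target, the order sets how cheaply a solver can hit it.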
The thread seems to be wandering off the topic of the AM solver. Maybe we can continue it in a new one. In general I’m fine with the idea of picking a solver based on test runs with priors, it’s just that I don’t see how it can be a robust and quantifiable process.
It comes back to Adams–Moulton because @wds15 and I both tried it on non-stiff problems and it was slow, so we were missing something.
Sounds like rk45 had good stability properties on the systems we tried things on, and so it was fast (and an implicit method was unnecessary). I don’t know how to know this beforehand.
We’ll need a better term than “stability” because it could mean several things. Anyway, in general rk45 is the method one picks when walking into the room knowing nothing about the ODE, and often our ODEs are very nice and not even close to stiff. However, rk45 is not known for stability but rather for the following (two of which are sketched in code after the list):
medium order (4/5)
relatively efficient (FSAL + local interpolation)
two embedded orders, so friendly to stepsize adjustment (automatic step-size control)
and, what I suspect is why most users choose it,
being explicit.
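Here is a small outside-of-Stan sketch of the embedded-orders and local-interpolation points, using scipy’s RK45, which is the same Dormand–Prince pair behind Stan’s rk45; the RHS is a made-up toy.

```python
# scipy's RK45 (Dormand-Prince): the embedded 4th/5th-order pair drives
# step adjustment, and dense_output exposes the free local interpolant.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    return [np.cos(3.0 * t) * y[0]]  # solution speeds up and slows down

sol = solve_ivp(rhs, (0.0, 10.0), [1.0], method="RK45",
                rtol=1e-6, atol=1e-9, dense_output=True)

# Accepted step sizes vary with how fast the solution moves; that's the
# embedded error estimate at work.
print("accepted steps:", len(sol.t))
print("min/max step:", np.diff(sol.t).min(), np.diff(sol.t).max())

# Dense output: evaluate anywhere without re-stepping (the "local
# interpolation" item in the list above).
print(sol.sol(np.linspace(0.0, 10.0, 5)))
```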
Now for stability, rk45 is inferior to rk23 (Bogacki–Shampine) but better than Adams–Bashforth (higher order but much less stable), and I’d say if we replaced rk45 with Adams–Bashforth or even forward Euler, it would still be people’s favorite, because these explicit methods almost always give you something (max_num_steps reached! so what? increase it!).
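To make that concrete, here is a minimal sketch (mine, not anything from Stan’s internals) of forward Euler, the simplest explicit method, on the linear test problem $y' = \lambda y$: it always produces numbers, but it only converges when the step stays inside its stability interval, which is exactly what forces explicit methods into tiny steps, and into max_num_steps, on stiff systems.

```python
# Forward Euler on y' = lam * y with lam = -50 (a mildly stiff decay).
# Stability requires |1 + h*lam| < 1, i.e. h < 2/50 = 0.04 here.
lam = -50.0

for h in (0.03, 0.05):  # one step size inside the limit, one outside
    y = 1.0
    for _ in range(40):
        y = y + h * lam * y  # explicit Euler update: y_{n+1} = (1 + h*lam) * y_n
    print(f"h = {h}: |1 + h*lam| = {abs(1 + h * lam):.2f}, y_40 = {y:.3e}")

# h = 0.03 gives |1 + h*lam| = 0.50 and y decays as it should;
# h = 0.05 gives 1.50 and y blows up. The solver still "gave you
# something", it's just wrong until you shrink the step.
```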