Note that with exact Newton one still has to choose between AD and what sundials call DQ method to calculate Jacobian. At this point without knowing much about use case I suggest we use default DQ.
For PKPD steady state problems I’ve tested DQ is slightly faster.
DQ is difference quotient, which is finite differentiation. I expect the method is less efficient and numerically accurate than AD, especially in high-dimensional cases. We can test that. I usually look at two cases (high-dimensional convex optimization and pmx steady state). Can you send me your test code?
The other thing is using fwd or rev mode, depending on the dimension of the input and output. This is something we can automate. But maybe a feature for a future PR. It would be better to do this one step at a time, to avoid further delays.
In my view, these options remain fairly manageable. The first PR will cover the first two options, then I can add the third signature in a second PR. Hopefully, we’ll get feedback from users on what they need.
I didn’t even look at this option. Kinsol is advertised as Sundial’s method for solving nonlinear algebraic systems. Since it has more features that we’ll potentially use in the future, I’d rather go ahead and implement it.
@yizhang I ran two tests that corroborate your results. Both methods give the same solution, but QD is faster. However, I have one test where QD fails, with flag -5 (“the linesearch algorithm was unable to find an iterate sufficiently distinct from the current iterate.”), while rev AD works fine.
For steady states (one compartment, but solved for 100 patients, so 200 states) QD is about twice as fast. For a convex optimization problem, based on a Laplace approximation, with dimension 100, QD also gives me a speedup (form 0.174s to 0.134s). But QD fails with dimension 500. Sometimes, Kinsol’s defaults trade precision for speed, and too much so, though I haven’t nailed down what is exactly happening here.
I ran further tests: I changed the dimension, simulated new data with the same dimension, and could not reproduce the flag -5 error. It could be a “fluke”. Any further insight is welcome. What I have been getting consistently is that QD is faster.
I’ll set QD as the default, but I’ll keep intact the code for rev mode AD.
It’s good to keep AD around, so that each time you get -5, you can test whether AD works for that case.
In Stan Math, I wrote a function called Kinsol_solve() that gives you control over a lot of arguments. You can notably specify whether to use QD or rev mode AD. With a little bit more work, I could have it take in any method specified by the user.
algebra_solver_newton is built on top of this, and brings all the usual checks we have in Stan. Additional controls are not exposed to the Stan language, but this can be added (see my post here). My goal is to proceed one PR at a time.
I’m now running all our unit tests (see test/unit/math/rev/functor/algebra_solver_test.cpp) on the Newton solver and I’ve gotten the flag -5 error for a rather simple case. Namely, solve for x:
This error is controlled by the scale step tolerance, which conservatively, I’ve set to 10^{-3}. If I use Kinsol’s default, \mathrm{uround}^{2/3}, the algorithm converges but the sensitivities are slightly off (the error being 10^{-7} for a value that should be 1). If I use AD instead of QD, I do not get the error.
AD is certainly the more conservative option here. I’m inclined to make it the default, and later add QD as an option the user can try to get some speedup.
My guess is the Jacobian is ill-conditioned for certain \theta_{1,2,3} in this example. If this is the case, one may want to apply preconditioner rather than modify scale tolerance. But as we discussed if we want to limit the number of knobs in Stan function we will have to choose the conservative option. On the other hand if we allow user to choose AD vs DQ as @avehtari suggested(also my preference), we’ll have to rethink the function signature design.
that won’t work - the upstream archives do not know about that feature.
You have to
checkout math at the branch
checkout cmdstan at develop (should work) - do that with recursive subtree checkout
set in make/local of cmdstan the MATH variable to point to the stan-math with the branch
… but to use the new stuff you have to write a c++ function to get access to the new thing… or you hack the branch such that the old algebra_solver function uses the new function instead.
(all of this is really early stuff, so expect issues; but those are probably useful information)
Thanks for the pointer. Hopefully, it is now fixed.
Wanting to test it.
You can test it on Stan Math. If you want to use it in the Stan language, you’ll need to wait for the PR in the Stan language. I’ll write this one once we have the Math PR merged.
@stevebronder I saw you made changes to the algebraic_solver, which created a conflict. A fix was straightforward, but can you summarise / point me to a place that summarises why you made those changes? I may need to propagate them to the new Newton solver.