Newton solver in Stan

Note that with exact Newton one still has to choose between AD and what sundials call DQ method to calculate Jacobian. At this point without knowing much about use case I suggest we use default DQ.

For PKPD steady state problems I’ve tested DQ is slightly faster.

Out of curiosity, if kinsol is too much, what is missing from Sundial’s sunnonlinsolnewton internal library?

DQ is difference quotient, which is finite differentiation. I expect the method is less efficient and numerically accurate than AD, especially in high-dimensional cases. We can test that. I usually look at two cases (high-dimensional convex optimization and pmx steady state). Can you send me your test code?

The other thing is using fwd or rev mode, depending on the dimension of the input and output. This is something we can automate. But maybe a feature for a future PR. It would be better to do this one step at a time, to avoid further delays.

The Jacobian computation question aside, I propose two sets of optional tuning parameters. Available signatures would then be:

algebra_solver_newton(system, y_guess, theta, x_r, x_i);

algebra_solver_newton(system, y_guess, theta, x_r, x_i, rel_tol, 
                      abs_tol, max_steps);

algebra_solver_newton(system, y_guess, theta, x_r, x_i, 
                      rel_tol, abs_tol, max_steps,
                      line_search, inexact_jacobian);

In my view, these options remain fairly manageable. The first PR will cover the first two options, then I can add the third signature in a second PR. Hopefully, we’ll get feedback from users on what they need.

I didn’t even look at this option. Kinsol is advertised as Sundial’s method for solving nonlinear algebraic systems. Since it has more features that we’ll potentially use in the future, I’d rather go ahead and implement it.

You can simply comment out the Jacobian function line in CVODES solver to use DQ and test performance. The mechanism is the same with kinsol.

Finite difference has bad rap, but sometimes that’s all we need.

-YZ

@yizhang I ran two tests that corroborate your results. Both methods give the same solution, but QD is faster. However, I have one test where QD fails, with flag -5 (“the linesearch algorithm was unable to find an iterate sufficiently distinct from the current iterate.”), while rev AD works fine.

For steady states (one compartment, but solved for 100 patients, so 200 states) QD is about twice as fast. For a convex optimization problem, based on a Laplace approximation, with dimension 100, QD also gives me a speedup (form 0.174s to 0.134s). But QD fails with dimension 500. Sometimes, Kinsol’s defaults trade precision for speed, and too much so, though I haven’t nailed down what is exactly happening here.

1 Like

I ran further tests: I changed the dimension, simulated new data with the same dimension, and could not reproduce the flag -5 error. It could be a “fluke”. Any further insight is welcome. What I have been getting consistently is that QD is faster.

I’ll set QD as the default, but I’ll keep intact the code for rev mode AD.

I guess this is due to QD having higher chance to have problems with finite accuracy of floating point representation.

It’s good to keep AD around, so that each time you get -5, you can test whether AD works for that case.

It’s good to keep AD around, so that each time you get -5, you can test whether AD works for that case.

In Stan Math, I wrote a function called Kinsol_solve() that gives you control over a lot of arguments. You can notably specify whether to use QD or rev mode AD. With a little bit more work, I could have it take in any method specified by the user.

algebra_solver_newton is built on top of this, and brings all the usual checks we have in Stan. Additional controls are not exposed to the Stan language, but this can be added (see my post here). My goal is to proceed one PR at a time.

I’m now running all our unit tests (see test/unit/math/rev/functor/algebra_solver_test.cpp) on the Newton solver and I’ve gotten the flag -5 error for a rather simple case. Namely, solve for x:

\begin{aligned} x_3 - \theta_3 = 0 \\ x_1 x_2 - \theta_1 \theta_2 = 0 \\ x_3 / x_1 - \theta_2 / \theta_1 = 0 \end{aligned}

This error is controlled by the scale step tolerance, which conservatively, I’ve set to 10^{-3}. If I use Kinsol’s default, \mathrm{uround}^{2/3}, the algorithm converges but the sensitivities are slightly off (the error being 10^{-7} for a value that should be 1). If I use AD instead of QD, I do not get the error.

AD is certainly the more conservative option here. I’m inclined to make it the default, and later add QD as an option the user can try to get some speedup.

1 Like

My guess is the Jacobian is ill-conditioned for certain \theta_{1,2,3} in this example. If this is the case, one may want to apply preconditioner rather than modify scale tolerance. But as we discussed if we want to limit the number of knobs in Stan function we will have to choose the conservative option. On the other hand if we allow user to choose AD vs DQ as @avehtari suggested(also my preference), we’ll have to rethink the function signature design.

1 Like

Ok, one PR at a time. The first version will only give minimal control to the user, and we can then expand to a more “pedantic” mode.

The first PR is up. Looking for reviewers: https://github.com/stan-dev/math/pull/1339.

2 Likes

I picked it up.

1 Like

Wanting to test it. I struggled with check-out. Could you help me?

andre@ubuntu:~/cmdstan$ make stan-update/feature/issue-1115-newton_solver
git submodule update --init --recursive
cd stan && git fetch --all && git checkout feature/issue-1115-newton_solver && git pull
Fetching origin
remote: Enumerating objects: 894, done.
remote: Counting objects: 100% (894/894), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 1075 (delta 855), reused 871 (delta 853), pack-reused 181
Receiving objects: 100% (1075/1075), 324.78 KiB | 0 bytes/s, done.
Resolving deltas: 100% (909/909), completed with 615 local objects.
From https://github.com/stan-dev/stan
   64dae6b5e..b24a112dc  develop               -> origin/develop
 * [new branch]          fix-jenkinsfile-clang -> origin/fix-jenkinsfile-clang
 * [new branch]          refactor/array-context-and-reader -> origin/refactor/array-context-and-reader
 * [new branch]          revert-2805-cleanup/clang-format -> origin/revert-2805-cleanup/clang-format
 * [new branch]          revert-2808-cleanup/clang-format -> origin/revert-2808-cleanup/clang-format
 * [new branch]          revert-2809-fix-jenkinsfile-clang -> origin/revert-2809-fix-jenkinsfile-clang
 * [new branch]          revert-revert/clang-tools -> origin/revert-revert/clang-tools
Fetching submodule lib/stan_math
From https://github.com/stan-dev/math
   27111a2b97..54277ffcc5  develop    -> origin/develop
   a8f54ded99..332fb86e65  feature/issue-755-laplace -> origin/feature/issue-755-laplace
error: pathspec 'feature/issue-1115-newton_solver' did not match any file(s) known to git.
makefile:197: recipe for target 'stan-update/feature/issue-1115-newton_solver' failed
make: *** [stan-update/feature/issue-1115-newton_solver] Error 1

that won’t work - the upstream archives do not know about that feature.

You have to

  1. checkout math at the branch
  2. checkout cmdstan at develop (should work) - do that with recursive subtree checkout
  3. set in make/local of cmdstan the MATH variable to point to the stan-math with the branch

… but to use the new stuff you have to write a c++ function to get access to the new thing… or you hack the branch such that the old algebra_solver function uses the new function instead.

(all of this is really early stuff, so expect issues; but those are probably useful information)

2 Likes

I suggest test after jenkins building & testing success. I’m currently waiting @charlesm93 for fix it.
https://jenkins.mc-stan.org/blue/organizations/jenkins/Math%20Pipeline/detail/PR-1339/4/pipeline

1 Like

Thanks for the pointer. Hopefully, it is now fixed.

Wanting to test it.

You can test it on Stan Math. If you want to use it in the Stan language, you’ll need to wait for the PR in the Stan language. I’ll write this one once we have the Math PR merged.

@stevebronder I saw you made changes to the algebraic_solver, which created a conflict. A fix was straightforward, but can you summarise / point me to a place that summarises why you made those changes? I may need to propagate them to the new Newton solver.