Newton solver in Stan

Stan’s current method for solving systems of (nonlinear) algebraic equations is Powell’s dogleg method, as implemented in Eigen. There have been a few examples where Newton’s method is faster, sometimes by up to an order of magnitude. Examples include “Precise but wrong results for a drug interaction model” and “Algebraic solver problems (Laplace approximation in Stan)”, plus some other experiments I conducted myself.

This warrants creating a new solver. I’ll detail the differences between the two solvers, along with recommendations on when to use which. Before opening an issue on GitHub, however, I wanted to start a conversation here. The way I see it, Newton’s solver would have the same signature as Powell’s solver. We may, however, want to add other options, such as an adaptive step size.
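
For concreteness, here is a rough sketch of what a C++-level signature could look like if it simply mirrors the current Powell-based solver in Stan Math; the name `algebra_solver_newton`, the argument types, and the default tolerances below are placeholders rather than a settled design:

```cpp
// Hypothetical sketch only: mirrors the shape of the existing Powell-based
// algebra solver. Names, argument types, and defaults are illustrative,
// not the actual Stan Math API.
#include <Eigen/Dense>
#include <ostream>
#include <vector>

template <typename F, typename T>
Eigen::Matrix<T, Eigen::Dynamic, 1> algebra_solver_newton(
    const F& f,                                    // functor defining the system f(x, y, dat, dat_int, msgs) = 0
    const Eigen::VectorXd& x,                      // initial guess for the solution
    const Eigen::Matrix<T, Eigen::Dynamic, 1>& y,  // parameters (may carry autodiff types)
    const std::vector<double>& dat,                // real data
    const std::vector<int>& dat_int,               // integer data
    std::ostream* msgs = nullptr,
    double relative_tolerance = 1e-10,             // placeholder default
    double function_tolerance = 1e-6,              // placeholder default
    long max_num_steps = 200);                     // placeholder default
// Newton-specific options (e.g. an adaptive step size or line-search
// control) could be appended as additional defaulted arguments.
```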


Is this a different Newton’s method than the optimizer we currently have? If it requires second derivatives, how do you propose to get them?

I don’t know who the best person is to talk to about these algorithms. If it were me, I’d ask @Marcus_Brubaker, @betanalpha, and @bgoodri.

Same idea but @charlesm93 is talking about something to use inside a Stan program rather than applied to a Stan program.


OK. Is there any way we could use a common implementation? One of the things I disliked about the ADVI implementation was the embedded implementation of an optimizer. I’d prefer to pull out a single implementation to maintain, even if it needs slightly different interfaces for optimizing a Stan program and whatever the use is here.


When solving an algebraic equation, Newton’s solver is a first-order method. It is a second-order method for an optimization problem, because the algebraic equation is constructed by taking the derivative of the objective function.
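
To spell out the distinction: for solving $F(x) = 0$ the Newton update is

$$
x_{k+1} = x_k - J_F(x_k)^{-1} F(x_k),
$$

which only requires the Jacobian (first derivatives) of $F$. For minimizing an objective $f$, the same update is applied to $F = \nabla f$,

$$
x_{k+1} = x_k - H_f(x_k)^{-1} \nabla f(x_k),
$$

so it requires the Hessian (second derivatives) of $f$.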

I can look at the Newton solver used for ADVI and link it to the new solver.

+1! Charles: maybe you, @betanalpha, and I should get together next week and have a quick look at how we should build a more unified algorithms design (so that the non-HMC stuff is at least as well designed as the HMC stuff).

What do you mean by “second order” here? It still has quadratic convergence (locally) in both cases, which is the usual meaning of second order in numerical analysis.

Oops, I mixed up some technical terms. I’m talking about the order of the derivative that needs to be computed.

And yes, happy to meet up next week and work this, and other things, out with Michael.


Cool! Sorry I got confused! I’m there on Wednesday and we can chat to work out a time then.

The “order” in this context has been confusing us for years!

Reviving this topic.

I had an email exchange with @yizhang regarding the Newton solver. He remarked that the solver currently used for ADVI works fine for small systems, but does not support parallelism or multiple solution schemes.

A more performant option may be KINSOL, the nonlinear solver in SUNDIALS. Since we already have a dependency on SUNDIALS, this shouldn’t be too much overhead to add.

The documentation can be found here: https://computation.llnl.gov/projects/sundials/kinsol and I’m currently going through it.

I plan to assess the utility of a high-performance solver for three optimization problems: modeling steady states, finding a Laplace approximation, and optimizing the ELBO (for ADVI). If the results are promising, we can write a spec and raise a feature issue on GitHub.


For the last case (optimizing the ELBO), the gradients are stochastic, and I didn’t see any optimization methods for stochastic gradients mentioned in the KINSOL or SUNDIALS documentation.

Based on the documentation for SUNDIALS, here are some interesting features of the KINSOL solver:

  • Parallelization: KINSOL, like all SUNDIALS solvers, supports parallel vector operations. These can be done using either SUNDIALS’ own or a user-provided parallel vector implementation. Note we would not need to parallelize the calculation of sensitivities, since the latter is done with the implicit function theorem after a solution has been found (see the formula after this list). That said, I’m not quite sure which mechanism would be required to do a local operation in parallel, but this seems like something we could tackle with MPI.
  • Step size: the step size is either constant, set by the user, or adaptive, using a Goldstein-Armijo line search. I wrote an implementation of the latter when building a nested Laplace approximation and didn’t observe any speedup, but how well the method works likely depends on the specifics of the problem. This approach should be good when the initial guess is far from the correct solution, which is likely in high dimensions.
  • Gradients: the algorithm takes user-specified gradients of F (i.e., the Jacobian), so we can use autodiff here.
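
To spell out the sensitivity calculation mentioned under parallelization: if $x^\ast(\theta)$ solves $F(x, \theta) = 0$, the implicit function theorem gives

$$
\frac{\partial x^\ast}{\partial \theta} = -\left(\frac{\partial F}{\partial x}\right)^{-1} \frac{\partial F}{\partial \theta},
$$

evaluated at the solution, so the sensitivities come from a single linear solve after the iterations have converged.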

It seems all of these features would apply to any of the problems above. It wouldn’t be hard to implement a Newton solver with the last two; parallelization is trickier. Do we even use parallelized vector operations anywhere? A rough sketch of how the pieces of the KINSOL API fit together is below.
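
Here is a rough sketch, based on my reading of the KINSOL user guide, of setting up a small system with the SUNDIALS 4.x-style API (dense linear solver plus the Goldstein-Armijo line search). The exact function names and headers may differ slightly between SUNDIALS versions, and return-code checking is omitted for brevity:

```cpp
// Rough sketch: solve a tiny 2-dimensional system with KINSOL
// (SUNDIALS 4.x-style API), dense linear solver, Goldstein-Armijo line search.
#include <kinsol/kinsol.h>
#include <kinsol/kinsol_ls.h>
#include <nvector/nvector_serial.h>
#include <sunmatrix/sunmatrix_dense.h>
#include <sunlinsol/sunlinsol_dense.h>

// F(u) = 0 with F(u) = (u0^2 + u1^2 - 1, u0 - u1):
// intersect the unit circle with the line u0 = u1.
static int system_f(N_Vector u, N_Vector f, void* /* user_data */) {
  double u0 = NV_Ith_S(u, 0), u1 = NV_Ith_S(u, 1);
  NV_Ith_S(f, 0) = u0 * u0 + u1 * u1 - 1.0;
  NV_Ith_S(f, 1) = u0 - u1;
  return 0;
}

int main() {
  const sunindextype n = 2;
  N_Vector u = N_VNew_Serial(n);      // initial guess, overwritten with the solution
  N_Vector scale = N_VNew_Serial(n);  // scaling vectors (all ones here)
  NV_Ith_S(u, 0) = 0.1;
  NV_Ith_S(u, 1) = 0.1;
  N_VConst(1.0, scale);

  void* kin_mem = KINCreate();
  KINInit(kin_mem, system_f, u);

  // Dense Jacobian, approximated by finite differences since no Jacobian
  // function is attached; with Stan we would supply an autodiffed one.
  SUNMatrix J = SUNDenseMatrix(n, n);
  SUNLinearSolver LS = SUNLinSol_Dense(u, J);
  KINSetLinearSolver(kin_mem, LS, J);

  // KIN_LINESEARCH selects the Goldstein-Armijo globalization.
  KINSol(kin_mem, u, KIN_LINESEARCH, scale, scale);
  // Expected solution: u0 = u1 = 1/sqrt(2).

  KINFree(&kin_mem);
  SUNLinSolFree(LS);
  SUNMatDestroy(J);
  N_VDestroy(u);
  N_VDestroy(scale);
  return 0;
}
```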

So it looks like the SUNDIALS lib we currently have in Math (version 3.1.0) does not contain the KINSOL routines. According to the SUNDIALS website, the latest release is version 4.1.0.

@wds15 has a pull request for updating to SUNDIALS 4.0.1, but this version doesn’t seem to contain any source files for the KINSOL routines. In fact, the directories look quite different from the ones in v4.1.0. Is this to be expected? Are we using a specific (perhaps incomplete?) version of SUNDIALS inside the Math library?

Yes, we delete everything from SUNDIALS that we don’t need for Stan… that includes KINSOL for now.

But if you are committed to using KINSOL, then we can include it in this PR; I would, however, ask you to work out the makefiles, if that is OK with you. If you are not used to writing our makefiles, you can kindly ask the PR author to do that for you as well…

If you are still in the phase of evaluating things, just work with a local KINSOL copy for now.

Ok, this makes sense.

I still have some details to work out, as I’m much less familiar with SUNDIALS than with Eigen. In any case, I don’t want to delay your pull request. Ideally, we would get your PR through, and I can then add KINSOL with the PR for the new solver.

The PR started with SUNDIALS 4.0.1… by the time the review started, 4.0.2 was out, and now there is 4.1.0… If you are fairly sure you will use KINSOL, I can include it. The least I can do is include it but not work out the makefiles yet.

Right now we don’t know where to put auxiliary files that are Stan/SUNDIALS specific anyway, so this is an open question that will take time to settle.

Maybe before diving into linking KINSOL, you can put it next to Powell’s method and do a performance profile. Though I’d take Newton over Powell any day given that the Jacobian is available, it’s better to have some tangible idea of the performance gain, and that doesn’t require the overhead of putting KINSOL into Math.

@yizhang Sounds like a plan. I hand-coded a Newton solver (super straightforward; a sketch of the idea is at the end of this post) and got performance gains when doing a nested Laplace approximation, something like a 10-fold speedup. I’ll replicate and extend the experiment with KINSOL.

@wds15 OK, let’s confirm the performance gain and, if warranted, you can add KINSOL. I can take a stab at the makefile.
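
For reference, a hand-coded Newton iteration along these lines looks roughly like the following Eigen sketch; this is not the exact code I used, and the toy system, tolerance, and iteration cap are made up for illustration (no line search or failure handling):

```cpp
// Minimal undamped Newton iteration for F(x) = 0 with Eigen.
#include <Eigen/Dense>
#include <iostream>

// Toy system: x0^2 + x1^2 = 1 and x0 = x1
// (intersect the unit circle with the line x0 = x1).
Eigen::VectorXd F(const Eigen::VectorXd& x) {
  Eigen::VectorXd f(2);
  f << x(0) * x(0) + x(1) * x(1) - 1.0, x(0) - x(1);
  return f;
}

Eigen::MatrixXd J(const Eigen::VectorXd& x) {
  Eigen::MatrixXd j(2, 2);
  j << 2.0 * x(0), 2.0 * x(1),
       1.0, -1.0;
  return j;
}

Eigen::VectorXd newton_solve(Eigen::VectorXd x,
                             double tol = 1e-10, int max_iter = 100) {
  for (int i = 0; i < max_iter; ++i) {
    Eigen::VectorXd f = F(x);
    if (f.norm() < tol) break;
    // Full Newton step: solve J(x) * dx = -F(x), then x <- x + dx.
    x += J(x).partialPivLu().solve(-f);
  }
  return x;
}

int main() {
  Eigen::VectorXd x0(2);
  x0 << 0.1, 0.1;
  std::cout << newton_solve(x0).transpose() << std::endl;  // ~ (0.7071, 0.7071)
  return 0;
}
```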

I’m trying to work through some of the KINSOL examples, but even the installation is giving me a hard time. The user manual prescribes using CMake, which apparently I don’t have on my computer – weird. I tried installing it using Homebrew, but got the following error message:

It appears you have MacPorts or Fink installed.
Software installed with other package managers causes known problems for
Homebrew. If a formula fails to build, uninstall MacPorts/Fink and try again.

and a broken link to a page on troubleshooting: https://git.io/brew-troubleshooting

I can try what the warning message prescribes, but I’m wondering if there is a slicker way of doing this. Ideally, I’d want to solve an algebraic equation with KINSOL from a unit test in Stan.

Here’s the full error message:

File "/Users/charlesm/anaconda/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 854, in resolve
pkg_resources.DistributionNotFound: The 'Sphinx==1.3.1' distribution was not found and is required by the application
make[2]: *** [Utilities/Sphinx/doc_format_man] Error 1
make[1]: *** [Utilities/Sphinx/CMakeFiles/documentation.dir/all] Error 2
make: *** [all] Error 2
Warning: It appears you have MacPorts or Fink installed.
Software installed with other package managers causes known problems for
Homebrew. If a formula fails to build, uninstall MacPorts/Fink and try again.

READ THIS: https://git.io/brew-troubleshooting

/System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:359:in `open_http': 422 Unprocessable Entity (GitHub::Error)
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:737:in `buffer_open'
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:212:in `block in open_loop'
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:210:in `catch'
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:210:in `open_loop'
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:151:in `open_uri'
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:717:in `open'
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:31:in `open'
	from /usr/local/Library/Homebrew/utils.rb:436:in `open'
	from /usr/local/Library/Homebrew/utils.rb:466:in `issues_matching'
	from /usr/local/Library/Homebrew/utils.rb:498:in `issues_for_formula'
	from /usr/local/Library/Homebrew/exceptions.rb:209:in `fetch_issues'
	from /usr/local/Library/Homebrew/exceptions.rb:205:in `issues'
	from /usr/local/Library/Homebrew/exceptions.rb:248:in `dump'
	from /usr/local/Library/brew.rb:160:in `rescue in <main>'
	from /usr/local/Library/brew.rb:67:in `<main>'

Tagging @seantalts.

OK, so the actual error is related to installing the Sphinx==1.3.1 distribution.