Newton solver in Stan

The test causing the error is

theta = degenerate_test(y_scale, x_guess_3, is_newton);

so the last test with the initial value at a saddle point. The others tests don’t cause an issue. AD consistently returns flag -11. With QD, you alternate between flag -5 and -11.

I can split the unit tests. It seems too much is going on in every test, and now that we have TEST_F there’s no advantage in grouping tests.

Please do. Because I just ran the following with QD (see https://github.com/yizhang-yiz/math/tree/yz_newton)

TEST_F(degenerate_eq_test, newton_dbl) {
  Eigen::VectorXd theta = degenerate_test(y_dbl, x_guess_1, true);
  EXPECT_FLOAT_EQ(8, theta(0));
  EXPECT_FLOAT_EQ(8, theta(1));
}

TEST_F(degenerate_eq_test, debug) {
  Eigen::VectorXd theta = degenerate_test(y_scale, x_guess_3, true);
  EXPECT_FLOAT_EQ(100, theta(0));
  EXPECT_FLOAT_EQ(100, theta(1));
}

and I got

bash-3.2$ test/unit/math/rev/mat/functor/algebra_solver_newton_debug_test
Running main() from lib/gtest_1.8.1/src/gtest_main.cc
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from degenerate_eq_test
[ RUN      ] degenerate_eq_test.newton_dbl
unknown file: Failure
C++ exception with description "algebra_solver failed with error flag -5." thrown in the test body.
[  FAILED  ] degenerate_eq_test.newton_dbl (1 ms)
[ RUN      ] degenerate_eq_test.debug
unknown file: Failure
C++ exception with description "algebra_solver failed with error flag -11." thrown in the test body.
[  FAILED  ] degenerate_eq_test.debug (0 ms)
[----------] 2 tests from degenerate_eq_test (1 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (1 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] degenerate_eq_test.newton_dbl
[  FAILED  ] degenerate_eq_test.debug

 2 FAILED TESTS

which explains why you’ve seen different exceptions thrown.

1 Like

In the updated unit tests you get flag -11 with both AD and QD, for TEST_F(degenerate_eq_test, newton_guess_saddle_point_dbl).

Right now the unit tests don’t pass, but given the non-convexity I think that’s fine. Note that the powell solver converges to a solution (which depending on the problem at hand is good or bad).

The current commit won’t pass the unit tests, but if we’re ok with commenting this test on the Newton solver, I can do that.

I’m fine with newton fails on this test, as long as we have consistent error between AD & DQ, which we do now.

The problem is not convexity, it’s null initial point gives singular Jacobian. Actually if you perturb the initial point just a little bit I believe newton would be fine as line search should be able to get the iteration started. Powell method can move the trials around because it’sa trust-region method that doesn’t use Jacobian.

It’s worth to include this particular test and failure msg in the unit test.

1 Like

Yes, sorry, that’s what I had in mind.

I think all the points in your review should be addressed by now. Once we have that PR in, I’ll write the Stan interface. If memory serves well, the next Stan release is mid-October.

Thank you. Could you take a look of the PR and make sure all the conversations are resolved?

all resolved.

1 Like

With this PR in, I was able to add fixed point solver PR, for problems of form x=f(x;\theta). The iteration is simply x_{i+1}=f(x_i; \theta), but the method has been shown out-perform Newton in cases where

  • f is a proper contraction mapping
  • Calculating Jacobian is expensive
  • The fixed point iteration is implemented with Anderson acceleraton(included in KINSOL).

One use case is in PKPD when we seek “steady state solution” of an ODE system, namely x^{\ast} such that

x^{\ast} = x(x^{\ast}; \theta)

where x is the flow corresponding to the ODE solution with x^{\ast} being the initial condition and \theta the parameter.

Since @charlesm93 has done newton & dogleg, I’d like to ask you to take a look and possibly review it. Thanks.

2 Likes

Excellent stuff, I’m on it.

If you have some performance tests, it would be good to share the results on discourse. Something akin to what I posted earlier.

It would be great to have experiments and theory guiding a user to know when to prefer which, even if just heuristics. For example does a user start with one of the algorithms and switch to the other if there are problems? What structure of f defines it as a contractive mapping and how can a user identify that structure empirically?

Good point, and I’d like to expand the scope to functions such as ODE solvers, as many users would probability want to know heuristics on things like stiff vs non-stiff. Unfortunately none of these questions are trivial or easy to resort to heuristics.

Of course, but I think we need some motivation/context that is informative to the average user before adding a new feature. If only to define a clear default and what behaviors might suggest switching to something else.

The ODE solvers are somewhat confined to power users who may know some of this already, but the algebraic solver has a myriad of statistical uses (setting priors from quantiles, inverting CDFs, etc) that will make it more widely used.

I see your point. I can add description on the method in the current FP PR. The question

needs to be addressed in a coordinated effort when more detailed documentation of dogleg & newton become available. I can see scope creep on the horizon as we are going to have sparse matrices and sparse solvers(hopefully soon).

I’ve added two unit tests to demo FP vs Newton:

  • Solve the earlier degenerated problem that Newton has problem with same starting point through an algebraic-equivalent fixed point form. The iteration converges to (5, 5) root:
Iter x_1 x_2
0 5 100
1 5 1905
1 5 5
  • Solve a large system of size n with m=n parameters y, i=1\dots m:
f(x_i) = e^{y_{i-10}x_{i-10}}e^{y_{i-9}x_{i-9}}\dots e^{y_{i+9}x_{i+9}}e^{y_{i+10}x_{i+10}}, \forall i=10\dots n-10\\ f(x_i) = e^{y_{i}x_{i}}, \forall i<10 \text{ or } i > n-10

Wall time(ms) when n=400:

FP Newton
109 289
2 Likes

@charlesm93

I am sorry to enter into this conversation. So, do I need to open a new tread?
This is my first time using algebra_solver_newton in Rstan.
Can you help me to solve this error?

Chain 1 Exception: Exception: algebra_solver failed with error flag -7.

Thank you

Hi @evihas ,
That sounds like an error message produced by KINSOL, the C++ library that supports the Newton solver. From the user manual:

Five consecutive steps have been taken that satisfy a scaled step
length test.

A link to what these flags mean (for KINSOL and the CVODES solver) can be found here: SUNDIALS constants - sundials.

Early replies buried in this thread discuss this error, see Newton solver in Stan - #36 by yizhang. Basically the steps the Newton solver seems is taking are very large compared to the norm of your initial guess, which suggests the solver may not be converging. If you use an initial guess with norm 0, this can inadvertently cause the issue.

2 Likes

I will just leave this here, the direct (and slightly more interpretable) link to SUNDIAL errors. I should make myself a bingo card. Can check off -5, -7 and -11 so far.

1 Like