The function tolerance argument in the algebraic solver does not work as expected


#1

(This is mainly for @charlesm93)

There’s something weird happening with the algebraic solver. I keep getting a nonsensical error:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: algebra_solver: the norm of the algebraic function is: 1.05681e-06 but should be lower than the function tolerance: 1e-06. Consider increasing the relative tolerance and the max_num_steps.  (in '/Users/ds921/Downloads/gplainstan/gp_poisson_latent_la.stan' at line 44)

There are two things:

  • I think it should say “consider decreasing relative tolerance”
  • I looked at the code and lines 172-175 set the relative tolerance and the max step length, but don’t set the function tolerance. However, lines 186-195 throw an error if the function tolerance isn’t satisfied. Why isn’t the nonlinear solver trying to achieve both tolerances?

My expected behaviour from reading the docs is that the non-linear solver will try to satisfy both of these conditions, although it might fail.

At the very least, the manual should be updated to say that the solver isn’t trying to satisfy the function tolerance, with instructions on how to turn that check off.

In the current situation, you need to have a solid upper bound for the norm of the function at the initial value (which is basically impossible if the function depends on parameters) to be able to set the relative tolerance in such a way that the absolute tolerance can be satisfied.
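To make the mismatch concrete, here is a minimal Python sketch. It is not Stan's solver: it assumes a plain scalar Newton iteration that stops on a relative step tolerance, followed by a separate check of the function value. The names `newton_with_step_tolerance`, `solve` and `scale` are made up for illustration; `scale` stands in for a parameter that changes the magnitude of the function.

```python
def newton_with_step_tolerance(f, df, x0, rel_tol, max_num_steps):
    """Newton's method that stops once the relative step size is below rel_tol."""
    x = x0
    for _ in range(max_num_steps):
        x_new = x - f(x) / df(x)
        converged = abs(x_new - x) <= rel_tol * abs(x_new)
        x = x_new
        if converged:
            return x
    raise RuntimeError("max_num_steps exceeded")


def solve(scale, rel_tol=1e-5, f_tol=1e-6, max_num_steps=50):
    """Solve scale * (x**2 - 2) = 0, then check |f(x)| after the fact."""
    f = lambda x: scale * (x * x - 2.0)
    df = lambda x: scale * 2.0 * x
    x = newton_with_step_tolerance(f, df, 1.0, rel_tol, max_num_steps)
    norm = abs(f(x))
    # The iteration never looked at f_tol; it is only checked here.
    return x, norm, norm <= f_tol


x_small, norm_small, ok_small = solve(scale=1.0)
x_large, norm_large, ok_large = solve(scale=1e6)
print(x_small, norm_small, ok_small)
print(x_large, norm_large, ok_large)
```

Both runs stop at essentially the same point, because the Newton step does not depend on `scale`. Only the second run fails the function-tolerance check, since the residual is multiplied by `scale`. Picking a `rel_tol` that guarantees the check passes would require knowing that scaling in advance.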


#2

@Daniel_Simpson If this isn’t sorted out, would you mind creating an issue in stan-dev/math if it’s a bug, or in stan-dev/stan if it’s just something we need to document? (If it’s just doc, it can be a comment in the next manual issue.)


#3

I don’t know which category this falls in.


#4

When you figure out if it’s doc or a bug, the only way to make sure it doesn’t get lost is to file an issue with a milestone and ideally someone assigned to it and labels indicating what it’s about.


#5

Hi all,

I’ll write an issue about this and provide a fix when I come back from a weekend trip.

> I think it should say “consider decreasing relative tolerance”

Yes, you’re absolutely right. Easy fix.

> In the current situation, you need to have a solid upper bound for the norm of the function at the initial value (which is basically impossible if the function depends on parameters) to be able to set the relative tolerance in such a way that the absolute tolerance can be satisfied.

The function tolerance was not part of Eigen’s original algebraic solver. It’s a feature I added because it struck me as an easy way to check the validity of the solution (kind of like an absolute tolerance). A user who doesn’t want this feature can set a high value for the function tolerance, which effectively disables the check. I can add a clarification to the manual.
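As a rough illustration of that workaround, the sketch below treats the function tolerance as a standalone check applied to the residual after solving. This is an assumed simplification in Python, not the Stan Math code; `check_function_norm` is a hypothetical helper, and the residual value is the norm reported in the error message at the top of the thread.

```python
import math


def check_function_norm(residual, f_tol):
    """Raise if the Euclidean norm of the residual exceeds f_tol."""
    norm = math.sqrt(sum(r * r for r in residual))
    if norm > f_tol:
        raise ValueError(
            f"the norm of the algebraic function is: {norm:g} "
            f"but should be lower than the function tolerance: {f_tol:g}"
        )
    return norm


residual = [1.05681e-06]  # norm from the error message in post #1

# With f_tol = 1e-6 the check rejects the solution.
try:
    check_function_norm(residual, 1e-6)
except ValueError as err:
    print("rejected:", err)

# With a very large f_tol the check can never fail, so it is effectively off.
print("accepted, norm =", check_function_norm(residual, math.inf))
```

The trade-off is that a very large tolerance also lets genuinely bad solutions through, so it may be safer to loosen the tolerance only as far as the problem requires.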


#6

Also, why was it necessary to restrict the initial vector to be data only? Can’t we just call value_of internally?


#7

I agree this is a limitation. Calling value_of internally should work.

The reasoning was that gradients don’t get computed with respect to the initial guess vector, so it might as well be passed as a vector of data. But there are indeed many scenarios where this constraint is a problem.
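The idea of calling value_of internally can be sketched in Python as follows. This is a toy under stated assumptions: `Var` is a hypothetical stand-in for an autodiff variable, `value_of` here is a local reimplementation rather than the Stan Math function, and `solve_sqrt` is a made-up solver that only uses the guess as a starting point.

```python
class Var:
    """Toy autodiff variable: a value plus a slot for derivative information."""

    def __init__(self, val, adj=0.0):
        self.val = val
        self.adj = adj


def value_of(x):
    """Return the plain float behind x, dropping any derivative information."""
    return x.val if isinstance(x, Var) else float(x)


def solve_sqrt(a, y_guess, steps=20):
    """Newton iteration for y * y = a, starting from y_guess."""
    # Strip autodiff information up front: no gradient flows through the guess.
    y = value_of(y_guess)
    for _ in range(steps):
        y = 0.5 * (y + a / y)
    return y


from_data = solve_sqrt(2.0, 1.0)                  # guess passed as data
from_param = solve_sqrt(2.0, Var(1.0, adj=3.0))   # guess passed as a "parameter"
print(from_data, from_param)
```

Both calls return the same solution, which is the point: the solver accepts either kind of guess, and the caller no longer has to convert parameters to data by hand.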