Algebra solver details?

features

#1

Where can I find details of the algorithm used in the algebra solver? I found the new section in the stan-reference, but it doesn’t mention anything about the algorithm (I added an issue comment about this). A quick look at the GitHub repository didn’t help either. I’m asking because I’d like to know whether it would be good for solving
0 = f - K ∇ log p(y | f)
where f is a vector of n latent values, K is an n x n covariance matrix, ∇ log p(y | f) is the vector of first derivatives of the log likelihood wrt f, and usually n > 1000. Usually this is solved with Newton’s method, but I’m guessing that the algorithm in algebra_solver would probably not be as good for this specific problem?

And for those who don’t know yet: solving the above equation finds the posterior mode of the latent values in a Gaussian latent variable model with a non-Gaussian likelihood, which could be used as part of the Laplace approximation for GMRFs and GPs with non-Gaussian likelihoods. Extra cool would be the possibility to give the log likelihood as an argument and let the autodiff compute the first derivative ∇ log p(y | f).
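For concreteness, here is a sketch of the usual Newton iteration for this mode-finding problem (in Python rather than Stan; the Bernoulli-logit likelihood and the toy squared-exponential covariance are just illustrative choices, not anything from the manual):

```python
import numpy as np

def laplace_mode(K, y, n_iter=50):
    """Newton iterations for 0 = f - K grad log p(y | f), here with a
    Bernoulli-logit likelihood, so W = -hessian of log p(y | f) is diagonal."""
    n = len(y)
    f = np.zeros(n)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-f))   # logistic link
        grad = y - p                   # d log p(y | f) / df
        W = p * (1.0 - p)              # -d^2 log p(y | f) / df^2
        # Newton step: f <- (K^-1 + diag(W))^-1 (W f + grad).
        # Multiply through by K to avoid inverting it:
        #   (I + K diag(W)) f_new = K (W f + grad)
        f = np.linalg.solve(np.eye(n) + K * W, K @ (W * f + grad))
    return f

# Toy problem: squared-exponential covariance on a 1-D grid
x = np.linspace(0.0, 1.0, 20)
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.2) ** 2) + 1e-8 * np.eye(20)
y = (np.sin(6.0 * x) > 0).astype(float)          # binary observations
f_hat = laplace_mode(K, y)
# at the mode, the stationarity condition 0 = f - K grad holds
residual = f_hat - K @ (y - 1.0 / (1.0 + np.exp(-f_hat)))
```

With a log-concave likelihood such as this one, the iteration converges quickly and the residual of the stationarity condition goes to zero.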

Aki


#2

Hi Aki,
The algorithm is Powell’s hybrid method, also called the dogleg method. The C++ implementation is based on an unsupported module in Eigen’s library for nonlinear optimization problems, see NonLinearOptimization/HybridNonLinearSolver.h. This module is itself based on the MINPACK-1 package.
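If you want to experiment with the algorithm itself, the same MINPACK-1 Powell hybrid routine is exposed in SciPy as `method='hybr'` (a stand-alone demo on a made-up system, not Stan’s actual code path):

```python
from scipy.optimize import root  # method='hybr' wraps MINPACK's Powell hybrid

def system(z):
    """A small algebraic system 0 = f(z), with roots at (1, 2) and (2, 1)."""
    x, y = z
    return [x + y - 3.0, x * y - 2.0]

sol = root(system, x0=[0.1, 0.1], method="hybr")
```

`sol.x` holds the root found from the initial guess, and `sol.success` reports whether the solver converged.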

The Jacobian is computed using the implicit function theorem. This involves computing some intermediate Jacobians with autodiff along the way.
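Concretely, the implicit function theorem says that for a solution y(θ) of 0 = f(y, θ), the sensitivity is dy/dθ = −(∂f/∂y)⁻¹ ∂f/∂θ. A one-dimensional toy check (my own example, not Stan’s internals):

```python
import numpy as np

# Toy system: f(y, theta) = y^2 - theta, with solution y(theta) = sqrt(theta)
theta = 4.0
y = np.sqrt(theta)            # what the solver would return (y = 2)

df_dy = 2.0 * y               # partial derivative wrt the unknown
df_dtheta = -1.0              # partial derivative wrt the parameter
dy_dtheta = -df_dtheta / df_dy            # implicit function theorem

analytic = 1.0 / (2.0 * np.sqrt(theta))   # d sqrt(theta) / d theta
```

The implicit-function-theorem value matches the analytic derivative, without ever differentiating the solver itself.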

I’m not sure how well this works for the problem you describe in comparison to Newton’s method. I’d be curious to learn more and I’m happy to help.

I’ll write a pull request to add some references to the doc.


#3

Thanks! Powell’s hybrid method can be implemented with different variations, but I found this description of the MINPACK version: https://www.gnu.org/software/gsl/manual/html_node/Algorithms-using-Derivatives.html
which provided the details I wanted. In summary: “The Hybrid algorithm retains the fast convergence of Newton’s method but will also reduce the residual when Newton’s method is unreliable.” Thus it should be useful for the case I mentioned, and it would also be safe with non-log-concave likelihoods (which may have other problems with the Gaussian approximation, but that’s another story).
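That quoted behaviour is easy to reproduce: on 0 = arctan(x) with starting point x0 = 2, undamped Newton diverges, while the MINPACK hybrid method (here via SciPy’s `method='hybr'`, just as an illustration) still drives the residual to zero:

```python
import numpy as np
from scipy.optimize import root

# Undamped Newton on 0 = arctan(x): x <- x - arctan(x) * (1 + x^2).
# The full Newton step overshoots for |x0| greater than about 1.39,
# so the iterates grow without bound.
x = 2.0
for _ in range(5):
    x = x - np.arctan(x) * (1.0 + x * x)

# The hybrid (dogleg) method limits the step length and converges to 0.
sol = root(np.arctan, x0=[2.0], method="hybr")
```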

Aki


Algebraic solver problems (Laplace approximation in Stan)
#4

I think the current document version has an error in the first example
real[] x_i) { // data (integer)
it would make more sense to have int[] x_i)

Aki


#5

The document says

A system of algebraic equations is coded directly in Stan as a function with a strictly specified signature.

I assume that the function name is fixed, but are the argument names also fixed?

Aki


#6

I think the current document version has an error in the first example
real[] x_i) { // data (integer)
it would make more sense to have int[] x_i)

Good catch.

I assume that the function name is fixed, but are the argument names also fixed?

Neither the function name nor the argument names are fixed. What’s fixed are the argument types and the return type. So the following would be valid:

real[] foo(real[] y, 
           real[] theta,
           real[] x_r,
           int[] x_i) { ... }

real[] algebra_system(real[] unknown, 
                      real[] parms,
                      real[] dat,
                      int[] dat_int) { ... }

However, this wouldn’t work:

real foo(real[] y,
         real[] theta,
         real[] x_r,
         int[] x_i) { ... }

because the return type is wrong. These constraints are the same in nature as those you have when you define an ODE system.


#7

I was asking this because the examples in the document don’t have anything referring to the name of the function:
y = algebra_solver(y_guess, theta, x_r, x_i);
Now I found from p. 474
real[ ] algebra_solver(function algebra_system, real[] y_guess, real theta, real[] x_r, int[] x_i)
so it seems the examples (p. 261, 262) should also have the function name as the first argument?

Aki


#8

That’s a huge blunder on my part. No idea how I missed this. You’re right, it should be

y = algebra_solver(f, y_guess, theta, x_r, x_i);

#9

The document example has
vector system(vector y, ...
but the p. 474 has
real[ ] algebra_solver(function algebra_system, real[] y_guess,
Sorry for my ignorance, but is a vector allowed if the signature is real[]?

Aki


#10

Hi Aki,

I got a little mixed up. The correct signature is:

vector foo(vector y,
            vector theta,
            real[] x_r,
            int[] x_i) { ... }

vector and real[] are not interchangeable. In general, a real[] acts as a container, while a vector is used for matrix operations.

You have to go under the hood to see that y and theta are used for vector operations, while x_r and x_i only act as containers. I got confused because I’m used to the ODE integrator, where y and theta are specified as real[]. But for the algebraic solver, they should be vectors.

Anyways, I’m writing a pull request to correct the doc as you suggested, and I’ll double-check that there are no errors.