Exposing quadratic optimizer

I’m trying to expose a higher-order function (the quadratic optimizer with linear constraints, see https://github.com/stan-dev/stan/issues/2556). I followed closely what was done for algebra_solver and then ran an already existing unit test, using:

./runTests.py src/test/unit/lang/parser/matrix_functions_test.cpp

The goal was to see if the code would compile. It doesn’t. The error message is in the attached file was return. error_message.txt (103.5 KB)

Ok. Both errors point to the same line of code, which I wrote: term_grammar_def.hpp:287:9: As far as I can tell, this line is fine and the error lies somewhere else. I’m not quiet sure where and how to look. After going through the files I made a few corrections, but none that fixed the above error. Note there is a total of 21 edited files.

Does anyone more familiar with the language parser have an idea about what may be causing the error? Thank you!

Sorry I missed this, @charlesm93. If you haven’t sorted it out yet, could you push a branch and email me? I can check it out.

Hey Bob,
branch is pushed and linked in my original post. I haven’t worked it out yet. I might give it a stab on the airplane.

You linked the issue. The branch is feature/issue-2556-quadratic_optimizer.

I’d recommend simplifying things. Remove the semantic actions to make sure the FUSION declarations and grammar rule types are all correct. Then compile the function and make sure that works. Then try to tie it into the grammar. It’s painful, but it’s the only way we were ever able to debug these Boost Spirit Qi grammars (in @syclik’s words, you want to take “baby steps”).

So I checked out your branch and removed the quadratic optimizer semantic action declaration, definition, and call from the grammar and everything compiles other than generate_expression because the case isn’t defined for the quadratic optimzer. Fix that (with a stub), then start with a stub for the semantic action to get it to compile. Then start filling it in piece by piece.

1 Like

Following Bob’s advice to break things down, I was able to fix a myriad of small errors which ultimately caused the bug. The changes were committed and the code now compiles.

It however seems I’m still missing a piece to expose the function. See the following unit test src/test/unit/lang/parser/quadratic_optimizer_test.cpp, according to which quadratic_optimizer is not a known function. Which file would the parser use to check if a higher-order function exists? Do I perhaps need to recompile certain files?

Tagging @mitzimorris.

checked out branch feature/issue-2556_etc
but looks like you haven’t added (or committed or pushed) file
src/test/unit/lang/parser/quadratic_optimizer_test.cpp

cheers,
Mitzi

@charlesm93 - there were a few remaining small errors -

  1. inconsistant name for semantic actions function validate_quadratic_optimizer_control_f

  2. test-model was missing arg #8

also tweaked the grammar rule so that it will backtrack if it hits functions named something like quadratic_optimizer_with_fries( etc.

committed and pushed my changes - hope this helps.
cheers,
Mitzi

1 Like

here is what I did to uncover these problems:

  1. started looking at the unit test - cross-referenced the line in the test-model file with the quadratic optimizer code and noticed that there was a missing argument.

  2. uncommented the grammar rule in src/stan/lang/grammars/term_grammar_def.hpp

at this point, the code wasn’t compiling. captured the c++ boost/spirit/qi compiler error novel to an output file. in the error novel, looked for the very first line of spew that referenced files in src/stan/lang/grammars which showed that there was a problem with the term grammar rule. careful cross-referencing of the semantic action and its corresponding declarations and definitions in the semantic_actions.hpp, semantic_actions_def.cpp files showed that the problem was with the function name.

cheers,
Mitzi

2 Likes

added tests for all 11 or so ways that one could mess up on the arguments to the quadratic_optimizer function call.

@Bob_Carpenter are there other tests that should be done w/r/t adding this to the language?

1 Like

Not that I’m aware of.

@mitzimorris I finally got a chance to check out your commits (have been a little mia this month). Thank you so much for resolving the issue.

Is there a chance to get the solve in the next Stan release?
I would be beneficial for function approximation as asked here:
https://discourse.mc-stan.org/t/curve-fitting-including-convexity-constraint/9077/15

1 Like

There are technical issues for the general implementation of a quadratic optimizer with linear constraints. Yes, we can solve the problem, and yes we can compute derivatives where they exist. But the resulting posterior is non-smooth at the boundaries, which is a problem for HMC. See this discussion here.

But then I would expect this is also true for the current algebra_solver. Depending on the
tolerance \epsilon > 0 this will create a non-smooth posterior. Is that true?

No. \epsilon is not a model parameter, you can only pass it as a double.

Maybe I misunderstood your query?

The algebraic solver requires full rank algebraic systems, otherwise the Jacobian is ill-posed and the function errs out. The quadratic solver is full rank in the interior but becomes rank deficient on the boundaries. In other words there are no inconsistencies.

Thank you. I was thinking about the solver as a numerical approximation. The solver may stop in any interval [-\epsilon, +\epsilon]. What happen to the derivatives? They may be far off.
This all would be no problem, if not my (or others) Stan programs gets stuck from time to time using
the algebra solver.

It’s true that the solution and furthermore all the derivatives we compute with autodiff are numerical approximations. What matters is that the error is not too large.

Ideally, you would set \epsilon to be small enough to get “sufficient” precision (i.e. your inference doesn’t get better if you make \epsilon smaller). The error in the solution propagates to the derivatives. So it’s not that the posterior is not smooth, rather that we’re doing a poor job evaluating the log density and the gradient.

my (or others) Stan programs gets stuck from time to time using the algebra solver.

Could you describe the exact symptoms? I’ve observed shortcomings with the solver and there may be tricks to get improved behavior. I’m also working on a better solver.

The Stan program with solver runs forever in the sampling period. I’m calculating the inverse of some log_sum_exp expected value problem in a planning system. It’s Markov chain systems (M/M systems).
The equations should be well behaved.

Currently I’m solving.

  vector algebra_system(vector y,  // unknowns
                        vector theta,  // parameters
                        real[] dat,
                        int[] dat_int) {
    vector[2] EV = E_value(y[1], y[2], theta[26:28], theta[1:25]);
    vector[2] f_x;
    f_x[1] = dat[1] - EV[1];
    f_x[2] = dat[2] - EV[2];
    return f_x;
  }

(I’d rather not like to share the function E_value, not because I’m not willing, but to avoid intellectual problems.)
dat is my target, but I could also minimize:

(EV[1] - dat[1])^2 + (EV[2] - dat[2])^2

I naively changed Stan’s algebra_solver.hpp to try use Eigen’s LevenbergMarquardt Algorithm:

line 156
Eigen::LevenbergMarquardt solver(fx);
line 167
solver.minimize(theta_dbl);

Since it also “solves” minimizes to f(x) = 0 by minimizing the squared error at 0.

See references here: https://stackoverflow.com/questions/18509228/how-to-use-the-eigen-unsupported-levenberg-marquardt-implementation

But it gives me some resize() exception. Although I admit I didn’t understand what’s going on in
the C++ file in all details.

You could try writing a vanilla Newton solver in Stan’s functions block and use this instead of the algebraic solver. If the problem is quasi-convex and doesn’t require too many iterations to solve, you might get some speed up. I have a branch with the Newton solver in math, but I haven’t exposed it to the language yet.

You should also make sure the calculations in E_value are efficient. Bottlenecks can arise with matrix calculations. Some general guidelines: avoid direct matrix inversion, use functions that take advantage of symmetry, etc. Of course, without seeing E_value it’s hard to prescribe anything specific.