Quadratic optimizer

Hi,

Shosh and I are working on a quadratic optimizer with constraints (it’s a Karush–Kuhn–Tucker problem: it generalizes the Lagrangian optimization problem by allowing some of the constraints to be inequalities). I’m attaching the spec for a prototype function, in case anyone has feedback.
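
For concreteness, the general form we have in mind is the standard QP below (the notation here is mine, not necessarily the spec’s):

\underset{x}{\min} \frac{1}{2} x^\top H x + v^\top x, subject to Ax + b = 0 and Cx + d \ge 0.

The KKT conditions attach multipliers \mu \ge 0 to the inequality constraints, with complementary slackness \mu_i (Cx + d)_i = 0; that slackness condition is what generalizes the plain Lagrangian setting.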

I would also like to hear from users who might be interested in such a function.
quadratic_program_gradient_doc.pdf (183.3 KB)

2 Likes

That sounds great.

Is this something you’re thinking about adding to Stan, like the algebraic solver? If so, it’ll need an issue with a spec of the function being built (which can point to external references).

The name in the language will have to be something like quad_solver or quadratic_solver as it looks like we’re going down the route of having multiple solvers.

When we get lambda abstractions with closures, it’ll get much easier to drop in lightweight functions, so I wouldn’t worry too much about that.

P.S. I’d write that signature matrix H(real[] theta, real[] delta) if those are two real arrays. In the manual, I’m using (real[], real[]):matrix as the notation for a function type that takes two real array arguments and returns a matrix.
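
For example, such a functor might look like this on the C++ side (the struct name and the toy 2 x 2 body are placeholders, not anything from the spec):

#include <vector>
#include <Eigen/Dense>

// Sketch of a functor with type (real[], real[]) : matrix.
// The name and the diagonal 2x2 example are placeholders.
struct h_functor {
  template <typename T>
  Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>
  operator()(const std::vector<T>& theta,
             const std::vector<double>& delta) const {
    Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic> H(2, 2);
    H << theta[0], 0,
         0, theta[1];
    return H;
  }
};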

1 Like

I’ll edit the document to use the appropriate signatures. And yes, theta and delta need only be arrays, though we may want to have them as vectors; I’ll tinker with that. I was thinking of calling the function quadratic_optimizer, since it’s a slightly different problem than solving equations.

The GitHub issue is here: https://github.com/stan-dev/math/issues/888.

1 Like

There is one technical difficulty. Computing the sensitivities (the Jacobian matrix of the solution with respect to the parameters) requires knowing which elements of the solution vector are zero and which are non-zero.

Unfortunately, a numerical optimizer may return something like 1e-16 instead of 0. I’m not sure then how to flag an element as zero or non-zero, other than by introducing a tuning parameter, say \alpha, and a condition like if (x[i] < alpha).
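
Something like this, for instance (alpha is a made-up tolerance, and I use fabs so that small negative values like -1e-16 also count as zero):

#include <cmath>
#include <vector>
#include <Eigen/Dense>

// Flag which components of the solution are treated as exact zeros,
// using a made-up tolerance alpha.
std::vector<bool> zero_pattern(const Eigen::VectorXd& x, double alpha) {
  std::vector<bool> is_zero(x.size());
  for (int i = 0; i < x.size(); ++i)
    is_zero[i] = std::fabs(x(i)) < alpha;  // 1e-16-scale noise counts as zero
  return is_zero;
}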

1 Like

Sounds like you need a sparse vector type.

1 Like

Have we used sparse vector types in Stan?

I finished writing the “evaluation” function and wrote some basic unit tests. Somehow, the inequality constraint doesn’t always seem to kick in. @shoshievass, maybe you could take a look and see if you can reproduce the error in R.

The branch is feature/issue-888-quadratic_optimizer, and the files of interest are:

  • stan/math/rev/mat/functor/quadratic_optimizer.hpp
  • stan/math/rev/mat/functor/eiquadprog.hpp
  • test/unit/math/rev/mat/functor/quadratic_optimizer_test.hpp

The error is exposed in the last lines of the test file:

  // test the inequality constraint.
  theta << 0, 1;
  x = quadratic_optimizer(fh(), fv(), fa_0(), fb(), theta, delta, n);
  std::cout << x << "\n \n";

All the elements of x should be non-negative, but x(1) = -1. I think the code is readable (though it’s all C++), but I can write out the math if needed.

I think it’s a numerical error issue.

The result I got from R is this: -3.143751e-16 2.000000e+00 0.000000e+00

Code below:

library(quadprogpp)

# Objective: min 0.5 * t(x) %*% G %*% x + t(g0) %*% x
G = matrix(c(2.1, 0.0, 1.0,
             1.5, 2.2, 0.0,
             1.2, 1.3, 3.1), 3, 3, byrow=T)
g0 = c(6, 1, 1)

# Equality constraints (eiquadprog convention): t(CE) %*% x + ce0 = 0
CE = matrix(c(1, 2, -1), 3, 1)
ce0 = c(-4)

# Inequality constraints: t(CI) %*% x + ci0 >= 0
CI = matrix(c(1, 0, 0, -1,
              0, 1, 0, -1,
              0, 0, 1,  0), 3, 4, byrow=T)
ci0 = matrix(c(0, 0, 0, 10), 4, 1)

x = QP.Solve(G, g0, CI, ci0, CE, ce0)

I actually had another example in mind:

library(quadprogpp)

n = 2
theta = c(0, 1)

# Objective: 0.5 * x1^2 + 0.5 * x2^2 + theta[2] * x2
G = diag(n)
g0 = c(0, theta[2])

# Inequality constraints t(CI) %*% x + ci0 >= 0, i.e. x >= 0
CI = diag(n)  # identity matrix
ci0 = c(0, 0)

# dummy equality constraints - not needed (setting trivial conditions)
CE = matrix(rep(0, n * n), n, n)
ce0 = c(0, 0)

x = QP.Solve(G, g0, CI, ci0, CE, ce0)

The optimizer returns x = (0, -1) (in both R and C++), but it seems to me the conditions set by CI and ci0 should force x to have only non-negative elements. The problem is indeed:

\underset{x_1, x_2}{\min} \frac{1}{2} x_1^2 + \frac{1}{2} x_2^2 + x_2

subject to x_1 \ge 0 and x_2 \ge 0; unless I messed something up?
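
Working through the KKT conditions by hand (my algebra, worth double-checking), the solution should be x = (0, 0): stationarity in x_2 gives

x_2 + 1 - \mu_2 = 0, \quad \mu_2 \ge 0, \quad \mu_2 x_2 = 0.

If x_2 > 0, then \mu_2 = 0 and x_2 = -1, a contradiction; so x_2 = 0 with \mu_2 = 1, and similarly x_1 = 0.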

Oh I see. Maybe the optimizer fails when you have dummy constraints like this? I think it may cause an error because the all-zero matrix is rank deficient (its rank is lower than its number of rows, so it has a nontrivial null space), which might break the linear algebra in the quadprog algorithm. I had similar errors when trying something like this earlier. Maybe we can check that the CE and CI matrices have full rank?
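
E.g., assuming we can call into Eigen (the function name is made up):

#include <stdexcept>
#include <Eigen/Dense>

// Made-up guard: reject an equality-constraint matrix that is rank
// deficient (such as the all-zero dummy CE above) before calling the solver.
void check_full_rank(const Eigen::MatrixXd& CE) {
  Eigen::FullPivLU<Eigen::MatrixXd> lu(CE);
  if (lu.rank() < CE.cols())  // one column per equality constraint
    throw std::invalid_argument("equality constraint matrix is rank deficient");
}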

That is likely to induce a discontinuity in the overall function, which is not surprising given the nature of these optimization problems, which push parameters towards boundaries (fundamentally a discrete phenomenon). This will cause issues going forward, because functions with nontrivial discontinuities are problematic for Hamiltonian Monte Carlo.

Yeah, this crossed my mind. My next move is to build a unit test which showcases this issue.

Here’s a simple example to illustrate the differentiation problem:

\underset{x_1, x_2}{\min} \frac{1}{2} x_1^2 + \frac{1}{2} x_2^2 + \theta x_2, subject to \mathbf{x} \ge 0 and x_1 = \phi.

x_1 is determined by the equality constraint.
Then x_2 = -\theta if \theta \le 0, else x_2 = 0.

The partial derivative \partial x_2 / \partial \theta does not exist at \theta = 0.
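
Written out piecewise:

\partial x_2 / \partial \theta = -1 for \theta < 0, and \partial x_2 / \partial \theta = 0 for \theta > 0,

so the two one-sided limits at \theta = 0 disagree.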

More generally, any phase transition (a point where x_i flips between zero and non-zero) will cause an issue. This can be seen clearly in the semi-analytical expression @shoshievass worked out. The problem seems mathematically ill-posed, but still: haven’t we already dealt with a comparable issue?

I know something similar came up in a PK/PD model with bolus dosing and lag times, but I don’t think we worked out a fix.

I don’t think it is mathematically ill-posed. It is a well-posed optimization problem that can have a boundary solution for some value of \theta. That can be difficult to deal with from a numerical perspective and even more so when embedded within NUTS.

Do you think there might be a path forward?

I think the issue is that QP.Solve() doesn’t accept zero matrices for the equality constraints (this is something we could catch with linear algebra in the C++ code; the error handling is pretty bad here).

However, you can exclude the equality constraints:

x = QP.Solve(G, g0, CI, ci0)

produces (0,0).

If you find what you believe to be a solution on the boundary, can you automatically re-run it with that parameter fixed to the boundary value?
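
Roughly a two-pass loop, say (solve and solve_fixed are placeholder names standing in for the actual QP calls, not real functions):

#include <cmath>
#include <functional>
#include <vector>
#include <Eigen/Dense>

// After a first solve, pin any coordinate that lands within alpha of the
// boundary to exactly zero, and re-solve the reduced problem.
Eigen::VectorXd refine_boundary(
    const std::function<Eigen::VectorXd()>& solve,
    const std::function<Eigen::VectorXd(const std::vector<int>&)>& solve_fixed,
    double alpha) {
  Eigen::VectorXd x = solve();  // first pass
  std::vector<int> boundary;
  for (int i = 0; i < x.size(); ++i)
    if (std::fabs(x(i)) < alpha)
      boundary.push_back(i);
  return boundary.empty() ? x : solve_fixed(boundary);  // second pass
}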

1 Like

I’m not sure I understand the idea. Could you please elaborate on what exactly we’d re-run, and why that would help with the Jacobian with respect to the model parameters that cause the solution to flip on and off the boundary?

Also, I can’t think of a reparametrization that would maintain the optimality of the quadratic program in an unconstrained formulation of the problem, but maybe there is one? @betanalpha, have you seen anything along these lines?

Maybe we can do a transformation and solve the optimization problem in an unconstrained space: say, set x = e^\lambda and solve for \lambda, with no inequality constraints.

The issue is that I don’t see how that kind of transformation would preserve optimality on the unconstrained space…
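
For instance, applying it to the small example above (my algebra, worth double-checking):

\underset{\lambda}{\min} \frac{1}{2} e^{2\lambda_1} + \frac{1}{2} e^{2\lambda_2} + e^{\lambda_2}

The original solution x = (0, 0) corresponds to \lambda_1, \lambda_2 \to -\infty, so the transformed problem has the right infimum but no attained minimum.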

What if we did something MH-like and rejected draws that led to negative values? From experience playing around with the unconstrained problem, it might be hard to draw unrejected parameters, but maybe not impossible?