Bounded and Constrained Optimization of non-linear objective in stan

Sorry for not getting to you earlier. I have no experience with using Stan for optimization, so I can’t comment on this part of the question. I am however a bit puzzled on why one would use sample to find the optimum instead of using the built in optimize functionality. Also note that the target in Stan is assumed to be the logarithm of the value to be optimized (for pure optimization of positive values this doesn’t change the meaning, but it is IMHO good to bear in mind).

I can nevertheless give a try on how to express the constraint.

Generally, adding to the target only if the constraint is met is IMHO unlikely to be sensible. Not adding anything means that for given i the log target is 0, so you might actually be putting some preference for values that break the constriant (depending on what the values of beta are).

I think that enforcing general inequalities in Stan is hard (see e.g. Multiple equality constraints on parameters). The problem is that it is hard to do without making the model not smooth (discontinuous derivatives). However, especially for pure optimization, it might be the case that some non-smoothness would be tolerated. If I am not mistaken, the constraints you have could be reformulated as:

functions {
  vector pmin(real a, vector b) {
    vector[num_elements(b)] res;
    for(i in 1:num_elements(b)) {
      res[i] = min({a, b[i]});
    }

    return res;
  }
}

data{
     int<lower=0> N; // number of ids
     vector[N] beta1; // coefficients for each id
     vector[N] beta2;
     vector[N] beta3;
}

parameters {
  vector<lower=0, upper=5>[N] x1;
  vector<lower=0, upper=5>[N] x2;
  vector<lower=0, upper=pmin(5, 10 - x1 - x2)>[N] x3;
}

model{
  for(i in 1:N){
    target += exp(3 + x1[i]*beta1[i] + x2[i]*beta2[i] + x3[i]*beta3[i]);
  }
}

This requires a bit more recent Stan that supports vector of upper bounds (so you either should use cmdstanr or install rstan via Repository for distributing (some) stan-dev R packages | r-packages ). I would guess this would work if the optimum isn’t particularly close the the sharp corners introduced by the parallel minima (pmin). But please check this for yourself.

Best of luck with your model!

1 Like