Additional constraint on unit simplex

How would I go about putting additional constraints on a unit simplex? For instance, where a unit simplex is defined by w > 0 and sum(w) = 1, I want the more general A * w = b in place of sum(w) = 1.

Below I have my original, simple model. It just regresses a constant vector of ones (the left-hand side) on X, with the coefficients constrained to a unit simplex.

I additionally want to constrain the coefficients such that w' * mu = mu_target. The second model is my failed attempt. I say it fails because, while the equality constraints hold, the inequality constraint does not (specifically, w_1 can be any real value, not just a value between 0 and 1).

I spent some time looking at Section 35.6 on unit simplexes in the documentation, but I'm not sure precisely what I would need to do to convert that into Stan code (maybe then it would be clearer how to adjust it to take the extra constraint into account).
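As far as I can tell, the stick-breaking construction in that section amounts to something like the sketch below (this is only my transcription of the manual, reusing N, X, and ones_T from the original model further down; log_jac is my own bookkeeping variable):

// stick-breaking transform written out by hand (no extra constraint yet)
parameters {
    vector[N - 1] y;            // unconstrained stick-breaking parameters
    real<lower=0> sigma;
}
transformed parameters {
    vector[N] w;                // on the unit simplex by construction
    real log_jac;

    {
        real stick;
        stick = 1;
        log_jac = 0;
        for (k in 1:(N - 1)) {
            real z;
            z = inv_logit(y[k] - log(N - k));
            w[k] = stick * z;
            // accumulate the log-Jacobian of the map y -> w
            log_jac = log_jac + log(stick) + log(z) + log1m(z);
            stick = stick - w[k];
        }
        w[N] = stick;
    }
}
model {
    target += log_jac;          // with this adjustment, w is uniform on the simplex
    ones_T ~ normal(X * w, sigma);
}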

Also, if there's some other way to accomplish this, I'm all ears.

// original model
data {
    int<lower=0> N;             // number of coefficients
    int<lower=0> T;             // number of observations
    matrix[T, N] X;
}
transformed data {
    vector[T] ones_T;

    ones_T = rep_vector(1, T);
}
parameters {
    simplex[N] w;
    real<lower=0> sigma;
}
model {
    // regress a constant vector of ones on X with simplex coefficients
    ones_T ~ normal(X * w, sigma);
}
// failed attempt
data {
    int<lower=0> N;
    int<lower=0> T;

    matrix[T, N] X;

    vector[N] mu;
    real<lower=0> mu_target;
}
transformed data {
    vector[T] ones_T;
    real mu_1;
    vector[N - 1] mu_minus_1;

    ones_T = rep_vector(1, T);
    mu_1 = mu[1];
    mu_minus_1 = mu[2:N];
}
parameters {
    simplex[N - 1] w_minus_1;
    real<lower=0> sigma;
}
transformed parameters {
    real dot_mu_w_minus_1;
    real w_1;
    vector[N] w;

    dot_mu_w_minus_1 = dot_product(mu_minus_1, w_minus_1);

    // solve w_1 * mu_1 + (1 - w_1) * dot_mu_w_minus_1 = mu_target for w_1;
    // nothing restricts w_1 to [0, 1], which is where this attempt fails
    w_1 = (mu_target - dot_mu_w_minus_1) / (mu_1 - dot_mu_w_minus_1);
    w[1] = w_1;
    w[2:N] = w_minus_1 * (1 - w_1);
}
model {
    ones_T ~ normal(X * w, sigma);
}

Hmm, it seems if I add an additional sampling statement on w' * mu, it produces some sensible results.

data {
    int<lower=0> N;
    int<lower=0> T;

    matrix[T, N] X;
    vector[N] mu;
    real mu_target;
}
transformed data {
    vector[T] ones_T;

    ones_T = rep_vector(1, T);
}
parameters {
    simplex[N] w;
    real<lower=0> sigma;
}
model {
    // soft version of the constraint w' * mu = mu_target
    dot_product(mu, w) ~ normal(mu_target, 0.001);
    ones_T ~ normal(X * w, sigma);
}

I'm surprised that Stan isn't giving you a warning that a Jacobian adjustment may be necessary (see Section 22.3 of the Stan language manual) – or is it?

I had ignored it…

d/dw (w' * mu) = mu, which doesn't depend on w, so you would only be incrementing the target by a constant, and constants don't affect sampling.

Given that you are still using the simplex type, I presume you mean that you want w > 0, sum(w) = 1, and A * w = b. The problem with this is that the additional constraint implies that the solution space for w lies in a lower-dimensional subspace. Just how low-dimensional depends on the specific structure of A and b. You can work out analytic results in some cases, but you have to be careful about boundary conditions.
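Concretely, for the single extra constraint in question, the feasible set is

$$\{\, w \in \mathbb{R}^N : w > 0,\ \textstyle\sum_k w_k = 1,\ \mu^\top w = \mu_{\mathrm{target}} \,\},$$

the intersection of the unit simplex with a hyperplane. Generically this is an $(N - 2)$-dimensional polytope, and it is nonempty only when $\mu_{\mathrm{target}}$ lies between $\min(\mu)$ and $\max(\mu)$, because $\mu^\top w$ is a convex combination of the entries of $\mu$; those endpoints are exactly the boundary conditions to be careful about.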

The way this will manifest in your model is that the w will be non-identified in some directions. Because the w are constrained by the simplex type, this won't lead to the sampler trying to drift off to infinity, but it will lead to narrow posteriors, small step sizes, and probably low E-BFMI warnings.

Michael, I appreciate the reply.

On your first point, I was thinking in terms of optimization. A * w = b is a common way to write constraints in quadratic optimization, and sum(w) = 1 could be included as one such constraint. Strictly speaking, I had only wanted one additional constraint (at the moment), so it would be w > 0, sum(w) = 1, and w' * mu = mu_target.

On your second point, my way of avoiding having to derive the solution analytically (from my second post) was to add an additional modelling statement, dot_product(mu, w) ~ normal(mu_target, 0.001);. I have been playing around with this a little since I posted it, and occasionally I will get a warning about a divergence or two, but I've been happy with the number of effective samples. The constraint doesn't always hold, though; it doesn't force w' * mu = mu_target exactly. This hasn't been a significant enough issue for me to get concerned yet.

Also, I was able to make the modification below, where zeros_T is a vector of T zeros and X_minus_mu is a centered version of X. The effect is that I am choosing the w that minimizes the variance of X * w while also trying to set w' * mu = mu_target. Pretty cool, IMO. The results look good so far.

model {
    // soft constraint: pull w' * mu toward mu_target
    dot_product(mu, w) ~ normal(mu_target, 0.001);
    // shrink X_minus_mu * w toward zero, i.e., minimize the variance of X * w
    zeros_T ~ normal(X_minus_mu * w, sigma);
}
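For completeness, the inputs referenced there can be built in transformed data. This is just the construction I have in mind (X_minus_mu subtracts mu from every row of X):

transformed data {
    vector[T] zeros_T;
    matrix[T, N] X_minus_mu;

    zeros_T = rep_vector(0, T);
    X_minus_mu = X - rep_matrix(mu', T);    // subtract mu from each row
}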

I had to replace
dot_product(mu, w) ~ normal(mu_target, 0.001);
with
target += T * normal_lpdf(dot_product(mu, w) | mu_target, 0.001);
because when $T$ goes to infinity, the dot_product term gets overwhelmed by the other modelling statement (which contributes $T$ terms to the target, versus only one for the constraint).

Now the constraint holds to within 0.001, though with fewer effective samples (still a reasonable number).

The alternative is to lower the scale, which has the same effect.
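To see the equivalence: up to an additive constant that doesn't involve the parameters,

$$T \cdot \mathrm{normal\_lpdf}(x \mid m, \sigma) = \mathrm{normal\_lpdf}\big(x \mid m, \sigma / \sqrt{T}\big) + \mathrm{const},$$

since both sides contain the same $-T(x - m)^2 / (2\sigma^2)$ term. So multiplying the lpdf by $T$ amounts to shrinking the scale by a factor of $\sqrt{T}$.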