Lower bound in a simplex element

panacea · September 19, 2019, 10:31pm

Hi all, I’ve got a newbie question: I’d like to specify a constraint on a specific element from a simplex vector a[K]. This is a[1] >= 0.05, for instance.

But I’m not sure if I can make it according to https://mc-stan.org/docs/2_18/stan-users-guide/vectors-with-varying-bounds.html

What would be the best way to approach it?

bgoodri · September 19, 2019, 11:48pm

You would have to do the stick-breaking transformation yourself after putting a lower bound of 0.05 on the first element.

nhuurre · September 20, 2019, 2:47pm

Custom stick-breaking is not necessary because linear mixture of simplexes is still a simplex and every component of a simplex is positive.

parameters {
  simplex[K] xs;
}
transformed parameters {
  simplex[K] x;
  x[1] = 0.05 + 0.95*xs[1];
  x[2:] = 0.95*xs[2:];
}

panacea · September 22, 2019, 12:03pm

Thank you both! I tested both approaches and they work, complying with my purpose -to have an element greater than or equal to 0.05 in the simplex-.

However, this has lead to my second question. The aim of such constraint is because I’m fitting a mixture model of bivariate Gaussians and I want to force one of such components to be responsible of (at least) 5% of the data points. This may seem strange, but it is my intention to separate the tail of the green component -shown in the Figure- into two (green and yellow) during the fitting process instead of performing it as a post-processing step. The red data points belong to a third component. In the figure, the ellipses correspond to the geometrical representation of the Gaussian distributions with a 95% confidence.

That is why I wanted to impose a constraint on the mixing proportions to force the yellow component to be on top of some datapoints -5% at least-. The thing is that the resulting mixing proportion for the yellow component is close to the lower bound (0.05), but it can be seen that the distribution is not on top of any data point (left image). The desired output would be somethin similar to the right image.

Does it make sense to impose a constraint over the mixing proportions in a mixture model in this way?

emiruz · September 24, 2019, 8:39am

If your purpose is scientific and your 5% constraint reflects prior knowledge about the problem then the right way to do this may be to reflect that knowledge in your priors. Rather than using the simplex it may make sense for you to implement the stick breaking process yourself as @bgoodri suggested above (it’s pretty easy to do). Then unlike with the simplex you’ll be able to decide what your prior is for each mixing component separately, and if you like you can make one of the priors strongly informative and focused around the 5% level.

If you share your code here, we can probably help refactor it.

spinkney · September 24, 2019, 10:12am

I think you can achieve something similarly with @nhuurre’s suggestion too by adding a parameter in place of the 0.05 and putting a strong prior on that parameter around 0.05.

parameters {
  simplex[K] xs;
  real<lower=0, upper=1> alpha;
}
transformed parameters {
  simplex[K] x;
  x[1] = alpha + (1 - alpha) * xs[1];
  x[2:] = (1 - alpha) * xs[2:];
}
model {
alpha ~ beta(1, 19);
...
}

martinmodrak · September 24, 2019, 10:30am

Unfortunately, I don’t believe constraining the proportion of data points is a good way to achieve the goal you have - as witnessed by the fact that the component is pulled towards the lower bound anyway. If you know a priori that you want to separately fit the tail of the green component, I would advise to include that structure in your model. One simple way would be to make the clustering hierarchical: first you find two clusters (red and green) and then you split the green cluster into two clusters. This might be a bit tricky to code but should IMHO better reflect what you are trying to do.

Good luck with the model!

panacea · October 1, 2019, 6:47am

Thank you all for your suggestions :-) All of them have contributed to being able to address the problem. Finally, I have managed it using the ‘generated quantities’ block. The way I see it, I think it makes sense.

stemangiola · March 23, 2020, 11:46pm

On a transformed parameter as such

transformed parameters {
  simplex[K] x;
  x= 0.05 + 0.95*xs
}

Can I use x on the left side of a probability statement (since is a linear transformation)?

x ~ dirichlet(softmax(a + b*x) * phi);

The reason that I can explain better is that I have this into a hierarchical model where dirichlet prior with components < 1 tend to push proportions to 0 and if there is not enough evidence the model never converges and proportion and prior push each-other lower and lower. If I have sensible boundary on simplex this may not happen.

nhuurre · March 24, 2020, 7:48am

Yes, it’s linear, the Jacobian is, uhh, either (K-1)*log(0.95) or K*log(0.95) but in any case constant so it can be omitted. (I assume x on the right side is an unrelated predictor and not the same simplex.)

stemangiola · March 25, 2020, 11:45am

Correct that was confusing

Topic		Replies	Views
Ordered simplex constraint transform Modeling	20	1807	May 11, 2023
Additional constraint on unit simplex Modeling	7	2177	February 7, 2018
How to achieve a weighted simplex (simplex generalization)? Modeling	11	908	August 6, 2021
Following up on several discussions of simplex adjustments and ragged arrays of simplexes Modeling techniques , specification , user-defined-functions , constraint-transform	13	167	December 10, 2024
Combining inequality, order, and no constraints on a Simplex Modeling specification	1	479	November 14, 2018

Lower bound in a simplex element

Related topics