# Constrain the parameters by their sum/product

Hello,

I am looking for general information on how to constrain parameters in Stan based on their sum/product. I have two situations where I expect that I will need to use this, so I will try to post two questions below.

1.) Say that you have a vector of parameters declared by vector[N] a. How can I code it so that the sum of all elements of a is 1?

2.) Say that you have two vectors of elements: vector[N] b and vector[N] c. How can I code it so that the product of b[1] and c[1] is always 1, product of b[2] and c[2] is always 1, product of b[3] and c[3] is always 1 and so on?

For 1.) we have the simplex parameter type:

parameters {
simplex[N] a;
}


For 2), given that the result of the product is always known, the value of one element can always be determined by the other:

b_1 * c_1 = 1 \\ c_1 = \frac{1}{b_1}

So you just need to declare one vector as a parameter, and the other as a transformed parameter:

parameters {
vector[N] b;
}
transformed parameters {
vector[N] c = inv(b);
}

3 Likes

Another option for 2, if the parameters are constrained to be positive, is to work on the log scale with

parameters{
vector[N] log_b;
}
transformed parameters{
vector[N] b = exp(log_b);
vector[N] c = exp(-log_b);
}


I think this is likely to be more numerically stable than working on the identity scale, but that’s just a hunch.

Note that if the parameters are not constrained to be positive, then you’ll have problems crossing zero because c will approach infinity as b approaches zero.

Edit: note also that to work on the log scale, if you place priors directly on the exponentiated terms (b and c) then you will need to add Jacobian adjustment terms to get the priors that you intend.

5 Likes

Great suggestions by @andrjohns and @jsocolar. Just in case it’s relevant, if you want to constrain the product of more than two positive parameters \beta to be equal to some positive value \theta, you can use a simplex parameter \kappa of length equal to the number of parameters values in \beta, and define \beta_i = \theta^{\kappa_i}. Because \Pi_i \beta_i = \Pi_i \theta^{\kappa_i} = \theta^{\Sigma_i \kappa_i} and \Sigma_i \kappa_i = 1 (because it’s a simplex), your product of parameter values in \beta are guaranteed to be \theta.

Note that this will only work for target values of the product other than 1.0.

6 Likes

That is an awesome trick @LucC!

Just to supplement with a solution for vectors whose product is 1.0, the Stan Users Guide has a section (1.7) on parameterizing centered vectors (i.e. vectors that sum to zero) including some useful (but limited) discussion of how to obtain the desired priors on their elements. See here: 1.7 Parameterizing centered vectors | Stan User’s Guide. Exponentiating a centered vector will always yield a vector whose product is 1 (and multiplying by the Nth root of P will yield a vector whose product is P).

2 Likes

Warning – the simplex satisfies a summation constraint and a positivity constraint. It cannot in general be used to model a set of real variables that sum to one but each can take on positive and negative values. For that you’d need to define fix one of the variables to the negative sum of the others.

The same issue will affect the exponential of simplex variables. Taking \beta_{i} = \theta^{\kappa_{i}} constrains each \beta_{i} to the interval [1, \theta].

Thank you very much for the heads up! However, I find myself unable to write the requited code for all of this to work. I would be grateful if someone could help me out. Let’s say that we have two vectors of length 10, whose elements can be both positive and negative:

parameters {
vector[10] a;
vector[10] b;
}


How would the Stan code look like if I want to constrain 1) The sum of all values of the vector a to be 1; 2) The product of all values of the vector b to be 1 and 3) to constrain the product of the n-th element of each vector to be 1 (e.g. a[1] * b[1] = 1, a[2] * b[2] = 1, a[3] * b[3] = 1 etc.)? Many thanks.

Do you want to enforce 1, 2, and 3 simultaneously, or are they three separate questions? To enforce simultaneously, notice that you have 8 degrees of freedom (10 + 10 nominal parameters and 12 equations). Once you know all the elements of a, you get all the elements of b trivially. So you can attempt to derive expressions for a[9] and a[10] in terms of a[1:8] subject to the constraints (1) and (2).

However, actually sampling from this thing will be a nightmare, I suspect. To see why, consider what happens if we drop constraints (1) and (3) and just try to enforce (2). Given that b can contain both positive and negative values (and assuming you don’t know a priori which values are positive and which are negative), consider what has to happen for any element to cross zero. The product of the remaining elements has to jump from positive infinity to negative infinity (or vice versa). This doesn’t mean that the posterior geometry is necessarily hopeless, because the element that jumps from one “end” of the real line to the other doesn’t have to be a parameter; it can be defined as one divided by the product of the remaining elements.
However, there is a big risk of overflow here. More importantly, the prior pushforward on the final element of b will be ridiculous. If you want a reasonable prior on all elements of b (say no element is allowed to be larger than R, where R can be arbitrarily large), then the prior cannot contain the immediate vicinity of zero for any element, and this will be impossible to sample. If you really, really need to sample this thing, then presumably you could initialize in all 2^8 possible combinations of negative and positive over the 8 parameters, and then do some kind of stacking. In general, however, I think that this direction of thinking mostly goes to show that it should be relatively rare that you would a priori know the product of two quantities without a priori knowing those quantities signs.

1 Like