How to add inequality constraint on sum of parameters

Hi everyone,

My advisor Dr. Adam Fleischhacker and I are working on elaborating the example models in the Stan reference manual (following a full Bayesian workflow) while learning Stan. We have a question about the ARCH model. The example in the manual is an ARCH(1) model; we would like to work on a more general ARCH(K) model:

\sigma_t^2 = \alpha_0 + \sum_{i=1}^{K} \alpha_i a_{t-i}^2

with the conditions \sum_{i=1}^{K} \alpha_i < 1 and \alpha_i \ge 0 for i > 0.

We wonder how to add this inequality constraint on the sum of the \alpha_i. Can we do something like:

parameters {
  vector<lower=0, upper=1>[K] alpha;
}
transformed parameters {
  real<lower=0, upper=1> sum_alpha;
  sum_alpha = sum(alpha);
}

Thank you!

In cases like these, it is best to downscale a simplex. So,

parameters {
  real<lower=0,upper=1> sum_alpha;
  simplex[K] alpha_unscaled; // sums to 1
  real<lower=0> alpha_0;
}
transformed parameters {
  vector[K] alpha = sum_alpha * alpha_unscaled; // sums to sum_alpha
}
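
For context, here is a minimal sketch of how this parameterization might slot into the ARCH(K) model from the original question. The data layout, the normal likelihood, and conditioning on the first K observations are my assumptions, patterned after the manual's ARCH(1) example rather than anything stated in this thread:

data {
  int<lower=0> T;    // number of observations
  int<lower=1> K;    // ARCH order
  vector[T] r;       // observed series
}
parameters {
  real mu;                           // mean
  real<lower=0> alpha_0;             // variance intercept
  real<lower=0, upper=1> sum_alpha;  // total of the ARCH coefficients
  simplex[K] alpha_unscaled;         // relative weights, sums to 1
}
transformed parameters {
  vector[K] alpha = sum_alpha * alpha_unscaled;  // sums to sum_alpha < 1
}
model {
  for (t in (K + 1):T) {
    real sigma2 = alpha_0;
    for (i in 1:K) {
      sigma2 += alpha[i] * square(r[t - i] - mu);
    }
    r[t] ~ normal(mu, sqrt(sigma2));
  }
}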

This is brilliant! Thank you very much!

Hi,
I did exactly this in one of my models.
However, Stan is giving me a warning that I may have to add a Jacobian adjustment because of the change of variables. To me it seems that everything we are doing is linear, so the Jacobian adjustment should not be necessary. Am I missing something?
Thanks.

If you are attempting to put a prior on alpha, the transformed parameter from the example above, then the transform is non-linear (it multiplies two parameters together). It is best to put the priors on the things in the parameters block unless you know how to do the Jacobian adjustment correctly.

Thanks for your answer. I am using alpha as the left-hand side of a sampling statement in the model block; I guess that is what you mean by putting a prior on it?
What do you mean by “putting the priors on the things in the parameters block”?
Also, could you point me towards a reference for the correct derivation of the Jacobians of such transforms? I think I do have the necessary math background, but I don’t know enough about Bayesian inference with NUTS and why it needs Jacobian corrections.

I mean putting a prior — in the model block — on the things declared in the parameters block, so on sum_alpha and alpha_unscaled. If you do go the Jacobian route, then watch the linked video, starting at about 7:30.
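
For example, a model block along these lines might look like the following; the specific beta and Dirichlet priors here are placeholder choices of mine, not recommendations from this thread:

model {
  sum_alpha ~ beta(2, 2);                        // prior on a declared parameter: no Jacobian needed
  alpha_unscaled ~ dirichlet(rep_vector(1, K));  // uniform over the simplex
  alpha_0 ~ normal(0, 1);                        // half-normal via the <lower=0> declaration
  // ... likelihood written in terms of alpha = sum_alpha * alpha_unscaled ...
}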

Thanks for this.
I have derived the adjustment for the above example and would be glad if somebody could tell me whether what I did is correct. I have put down all the steps in detail. The only thing I am unsure about is whether I can just set the redundant component of the simplex aside.

Define \alpha as the K-simplex and \beta as its scale (a scalar). \gamma is given by

\gamma = \beta \alpha.

Thus, the transform \gamma = f(\alpha,\beta) can be defined by leaving the redundant \alpha_K aside:

\gamma_k = \beta \alpha_k \; \mathrm{for} \; k \in (1,K-1)
\gamma_K = \beta \left(1-\sum_{i=1}^{K-1} \alpha_i\right)

The adjustment is the determinant of the Jacobian of the inverse transform:

\mathrm{det}(J_{f^{-1}}) = (\mathrm{det}(J_f))^{-1}

J_f can be divided into a diagonal part for which

A = J_f(i,i) = \beta \; \mathrm{for} \; i \in (1,K-1),

and

B = J_f(i,K) = \alpha_i \; \mathrm{for} \; i \in (1,K-1)
C = J_f(K,i) = -\beta \; \mathrm{for} \; i \in (1,K-1)
D = J_f(K,K) = 1-\sum_{i=1}^{K-1}\alpha_i.

Thus the determinant is

\mathrm{det}(J_f) = (D-CA^{-1}B) \; \mathrm{det}(A) = \beta^{K-1}

and

\mathrm{det}(J_{f^{-1}}) = \beta^{1-K}.

We have to apply the log of the absolute value of this determinant (\mathrm{log}(|\beta^{1-K}|)) as the Jacobian correction in our model block:

target += (1-K)*log(beta).

EDIT: the correction should be target += (K-1)*log(beta), see below for details!
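
As a quick sanity check (my addition, for the case K = 2): the forward transform is (\alpha_1, \beta) \mapsto (\beta \alpha_1, \beta (1 - \alpha_1)), so

J_f = \begin{pmatrix} \beta & \alpha_1 \\ -\beta & 1 - \alpha_1 \end{pmatrix}, \qquad \mathrm{det}(J_f) = \beta (1 - \alpha_1) + \beta \alpha_1 = \beta = \beta^{K-1},

which agrees with the general result above.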

I’m lousy at linear algebra of this complexity, but I can say that it will depend on how the parameters are declared in Stan. Do you start with \alpha being a simplex? If so, its Jacobian is already included; otherwise, you need to include the simplex transform as you seem to be doing for \gamma_K.

Yes, \alpha is defined as a simplex. So in this case I don’t have to apply any Jacobian correction, even if I multiply the components of \alpha by another parameter \beta?

I have to count Jacobians on my fingers.

Overall, what needs to happen is that the K free unconstrained parameters map to the scaled simplex \beta \alpha, where \alpha is a simplex and \beta is positive. I just meant that the Jacobians for the map from the K - 1 unconstrained parameters to the simplex \alpha and from the 1 unconstrained variable to the positive \beta are already included.

But now that I write it out, I see you took this into account in the mapping from the K - 1 outputs of the simplex and \beta to \beta \alpha.

Now if I lay out another transform as

(\alpha_1, \ldots, \alpha_{K - 1}, \beta) \mapsto (\beta \alpha_1, \ldots, \beta \alpha_K)

then we need to add the log Jacobian determinant of this map to the implicit log Jacobians applied by the simplex and positivity transforms. The Jacobian is lower triangular [edit: it is not; see below!], with the first K - 1 elements of the diagonal being, for k \in 1:(K - 1),

J_{k, k} = \frac{\partial}{\partial \alpha_k} \beta \alpha_k = \beta

and the K-th element being

J_{K, K} = \frac{\partial}{\partial \beta} \, \beta \cdot \alpha_K \ = \ \frac{\partial}{\partial \beta} \, \beta \cdot \left( 1 - \sum_{k=1}^{K-1} \alpha_k \right) \ = \ \alpha_K

[edit: this is right as far as it goes, but it misses the non-zero elements in the last row and column.]

The log Jacobian determinant of a lower-triangular matrix is just the sum of the logs of the diagonal elements [edit: correct], so it looks like it should be [edit: but is not]:

\log \mathrm{det}(J) = (K - 1) \log \beta + \log \alpha_K

So it looks to me like the log Jacobian above is missing a \log \alpha_K term in the final log Jacobian determinant [edit: it’s not].

But as I said, I’m really bad at this. These problems are like homework to me and I keep hoping to get better at it. So please correct my homework if it’s wrong. [edit: Thanks!]


That’s about where I am with these things; I haven’t used much linear algebra since my undergrad…

Yes, this is the transform I was looking at. I think you are missing some elements:

J_{K,k} = \frac{\partial}{\partial \beta} \beta \alpha_k = \alpha_k \; \mathrm{for} \; k \in (1,K-1)
J_{k,K} = \frac{\partial}{\partial \alpha_k} \beta \alpha_K = \frac{\partial}{\partial \alpha_k} \beta \left(1-\sum_{i=1}^{K-1} \alpha_i\right) = -\beta \; \mathrm{for} \; k \in (1,K-1)

Thus the matrix is not triangular, but the determinant can be found by dividing it into blocks A to D (see above), where D is a 1x1 matrix:

\mathrm{det}(J_f) = (D-CA^{-1}B) \; \mathrm{det}(A) = \beta^{K-1}

Also, I thought that the Jacobian correction is the one of the inverse transform, which in this case is \beta^{1-K}, meaning that the correction would be

target += (1-K)*log(beta)

Let me know if this doesn’t make sense; as I said, I haven’t done much of this at all in the last 10 years or so…

I wrote that part of the manual. And yes, it’s the inverse transform, which for our constrained parameters is the mapping from the unconstrained space to the constrained space. Or in this case, your change of variables,

(\alpha_1, \ldots, \alpha_{K - 1}, \beta) \mapsto (\beta \alpha_1, \ldots, \beta \alpha_K)

As you point out, the Jacobian’s not lower triangular. I should’ve drawn it all out rather than rushing, and you can see the structure:

J = \begin{array}{c|ccccc} & \alpha_1 & \alpha_2 & \cdots & \alpha_{K-1} & \beta \\ \hline \beta \alpha_1 & \beta & 0 & \cdots & 0 & \alpha_1 \\ \beta \alpha_2 & 0 & \beta & \cdots & 0 & \alpha_2 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ \beta \alpha_{K-1} & 0 & 0 & \cdots & \beta & \alpha_{K-1} \\ \beta \left(1 - \sum_{k=1}^{K-1} \alpha_k\right) & -\beta & -\beta & \cdots & -\beta & 1 - \sum_{k=1}^{K-1} \alpha_k \end{array}

Now I’m stuck on the determinant (I was a pure math major who did abstract linear algebra up to Galois theory—determinants were for engineers, or so my advisor said). If the determinant’s \beta^{K-1}, then yes, the appropriate Jacobian adjustment for the change of variables in Stan would be:

target += (K - 1) * log(beta);

Thanks! Our forums remain a great place to get my homework corrected :-)


I had to look it up too. :-)

Thanks to you, and thanks for pointing out that I was taking the inverse of the inverse :)

Cheers.

Use Schur’s theorem: for a block matrix with an invertible top-left block A,

\mathrm{det} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \mathrm{det}(A) \, \mathrm{det}(D - C A^{-1} B).

The top left corner of \mathbf{J} is a diagonal submatrix of size K - 1, so we know its determinant (\beta^{K - 1}) and its inverse (diagonal with entries \frac{1}{\beta}). So, all we have to do is multiply by the determinant of the Schur complement, but that is just a 1\times1 matrix, which works out to 1 - \sum_{k=1}^{K-1}\alpha_k + \beta \mathbf{1}^\top \mathbf{I} \frac{1}{\beta} \left[\alpha_1, \alpha_2, \dots, \alpha_{K-1}\right]^\top = 1. Hence \mathrm{det}(\mathbf{J}) = \beta^{K-1} \cdot 1 = \beta^{K-1}.


Thanks, Ben. I really need to learn matrices.

Sorry, @bgoodri, to bring this up after a long time.

Do you mean that if we put a prior (in the model block) on things declared in the transformed parameters block, so on alpha, we wouldn’t require the Jacobian adjustment?

Sorry, the conversation is a bit hard for me to follow, so I wanted to seek clarification.

Thanks!

If you put priors on things declared in the parameters block, then you never need a Jacobian adjustment. If you put priors on things not declared in the parameters block, then you need a Jacobian adjustment unless the Jacobian matrix is a constant with respect to the things declared in the parameters block.

Ah, I see. Would you know of a worked-out example that I could look up? Appreciate it. Thank you.
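
For the record, a sketch that pulls this thread's pieces together: a prior placed directly on the transformed alpha, together with the adjustment derived above. The exponential prior is purely illustrative; the adjustment term is the (K - 1) \log \beta from this thread, with sum_alpha playing the role of \beta:

model {
  // alpha = sum_alpha * alpha_unscaled is declared in transformed parameters,
  // so a prior on it requires the change-of-variables adjustment from this thread.
  alpha ~ exponential(1);              // illustrative prior on the transformed parameter
  target += (K - 1) * log(sum_alpha);  // log |det J| = (K - 1) * log(beta), with beta = sum_alpha
  // ... likelihood as before ...
}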