Covariance matrix with constraints

Hello,
I am trying to estimate the covariance matrix parameters of a multivariate normal distribution \boldsymbol{X} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}). In this case, \boldsymbol{X} is 3-dimensional. Assume \boldsymbol{\mu} is known.

I have a priori theoretical reasons for wanting to incorporate the following constraints on \boldsymbol{\Sigma}.

  • \boldsymbol{\Sigma}_{1,1}=\boldsymbol{\Sigma}_{2,2}=\boldsymbol{\Sigma}_{3,3} (=a)

  • \boldsymbol{\Sigma}_{1,2}=\boldsymbol{\Sigma}_{2,1}=\boldsymbol{\Sigma}_{2,3}=\boldsymbol{\Sigma}_{3,2} (=b)

  • \boldsymbol{\Sigma}_{1,3}=\boldsymbol{\Sigma}_{3,1} (=c)

Thus, I would like to estimate the three parameters (a,b,c) that determine \boldsymbol{\Sigma}.
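For concreteness, here is the target structure with illustrative values for (a, b, c) (made-up numbers, not from any real data); whatever values are estimated, the resulting matrix still has to be positive (semi-)definite, which can be checked via its eigenvalues:

```python
import numpy as np

# Hypothetical values for the three free parameters (illustration only).
a, b, c = 2.0, 0.5, -0.3

# The constrained covariance matrix: common variance a on the diagonal,
# b on the (1,2)/(2,1) and (2,3)/(3,2) entries, c on the (1,3)/(3,1) corners.
Sigma = np.array([[a, b, c],
                  [b, a, b],
                  [c, b, a]])

# Not every (a, b, c) triple yields a valid covariance matrix:
# it must be positive (semi-)definite.
assert np.allclose(Sigma, Sigma.T)
assert np.all(np.linalg.eigvalsh(Sigma) >= 0)
```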

I have read, on this forum and in Stan’s documentation, about the methods Stan has for ensuring covariance matrices are positive definite. However, I do not know how to impose these equality constraints on \boldsymbol{\Sigma} in Stan.

I am new to estimating covariance matrices, so any guidance would be much appreciated.

Thank you!

Hi, the constraint on the diagonal entries is straightforward because you can decompose \Sigma as SRS, where S is a diagonal matrix of standard deviations and R is the correlation matrix. Since all three variances equal a, a single standard-deviation parameter \sqrt{a} fills in all the diagonal entries of S.
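As a quick numeric check of that decomposition (the correlation values here are made up), S being a scalar multiple of the identity means SRS collapses to a times R:

```python
import numpy as np

a = 2.0                      # common variance for all three components
S = np.sqrt(a) * np.eye(3)   # diagonal matrix of standard deviations

# Some valid 3x3 correlation matrix R (illustrative entries).
R = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

Sigma = S @ R @ S
# Because S = sqrt(a) * I, the product S R S equals a * R.
assert np.allclose(Sigma, a * R)
```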

Imposing the off-diagonal constraints on R is trickier. I like the soft-constraint approach described in this blog post by @Stephen_Martin. There are also some approaches based on the Cholesky factor; an overview is in my paper here.

I haven’t worked out the Jacobian, so I don’t suggest putting priors directly on the Cholesky factor here, but the following program generates the Cholesky factor of a correlation matrix with the given restriction:

parameters {
  vector[3] y;
}
transformed parameters {
  // Start from the identity; the construction below is specific to 3 dimensions.
  matrix[3, 3] L = diag_matrix(rep_vector(1, 3));

  real s = norm2(y);
  // Third row: an arbitrary unit vector.
  L[3, 1:3] = y' / s;
  // Choose L[2,1] so that the implied correlations satisfy R[2,1] == R[3,2].
  L[2, 1] = L[3, 2] / sqrt(L[3, 2]^2 + L[3, 1]^2 - 2 * L[3, 1] + 1);
  L[2, 2] = sqrt(1 - L[2, 1]^2);

  matrix[3, 3] Sigma = multiply_lower_tri_self_transpose(L);
}
model {
  y ~ std_normal();
}
generated quantities {
  matrix[3, 3] L_out = cholesky_decompose(Sigma);
}
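As a sanity check (my own sketch, not part of the Stan program above), the same transform can be replicated in NumPy to confirm that the resulting matrix is a correlation matrix whose (2,1) and (3,2) entries are equal:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(3)          # unconstrained parameters, as in the Stan model

L = np.eye(3)
s = np.linalg.norm(y)
L[2, :] = y / s                     # third row: an arbitrary unit vector
# Same formula as the Stan code, shifted to 0-based indexing.
L[1, 0] = L[2, 1] / np.sqrt(L[2, 1]**2 + L[2, 0]**2 - 2 * L[2, 0] + 1)
L[1, 1] = np.sqrt(1 - L[1, 0]**2)

Sigma = L @ L.T                     # multiply_lower_tri_self_transpose(L)

assert np.allclose(np.diag(Sigma), 1)        # unit diagonal: a correlation matrix
assert np.isclose(Sigma[1, 0], Sigma[2, 1])  # the required equality constraint
```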

You can put your standard deviations in a separate vector and scale R to form the full covariance matrix.
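For the three-parameter structure in the question that last step is simple, because every standard deviation equals \sqrt{a}. A sketch of the scaling (a and the correlation entries are illustrative values), written the way Stan's quad_form_diag(R, sds) would compute it:

```python
import numpy as np

a = 2.5                                 # hypothetical common variance
R = np.array([[1.0, 0.3, -0.1],
              [0.3, 1.0, 0.3],
              [-0.1, 0.3, 1.0]])        # correlation matrix with the required pattern

sds = np.full(3, np.sqrt(a))            # vector of standard deviations
Sigma = np.outer(sds, sds) * R          # equivalent to quad_form_diag(R, sds)

# The parameters (a, b, c) can be read back off the covariance matrix:
assert np.isclose(Sigma[0, 0], a)            # a: the common variance
assert np.isclose(Sigma[0, 1], Sigma[1, 2])  # b: the two inner covariances agree
assert np.isclose(Sigma[0, 2], Sigma[2, 0])  # c: the corner covariance is symmetric
```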

Example output in R

> mod_out$summary("Sigma")
# A tibble: 9 × 10
  variable       mean  median    sd   mad     q5   q95  rhat ess_bulk ess_tail
  <chr>         <dbl>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
1 Sigma[1,1] 1        1       0     0      1     1     NA         NA       NA 
2 Sigma[2,1] 0.0178   0.0216  0.570 0.717 -0.896 0.907  1.00    3664.    3015.
3 Sigma[3,1] 0.000146 0.00484 0.576 0.734 -0.899 0.899  1.00    4054.    3159.
4 Sigma[1,2] 0.0178   0.0216  0.570 0.717 -0.896 0.907  1.00    3664.    3015.
5 Sigma[2,2] 1        1       0     0      1     1     NA         NA       NA 
6 Sigma[3,2] 0.0178   0.0216  0.570 0.717 -0.896 0.907  1.00    3664.    3015.
7 Sigma[1,3] 0.000146 0.00484 0.576 0.734 -0.899 0.899  1.00    4054.    3159.
8 Sigma[2,3] 0.0178   0.0216  0.570 0.717 -0.896 0.907  1.00    3664.    3015.
9 Sigma[3,3] 1        1       0     0      1     1     NA         NA       NA 