How to fit "standard" Co-variance structures

Hi All,

First off apologies this is a cross post from stack overflow as I have only just found this forum !!
EDIT: removed link to original post as apparently new users can only have 2 links per post >.<

I am currently using R-Stan to fit a multivariate normal distribution. The current model is

b ~ MVN ( 0 , Sigma )

where

b = ( x1 , x2 , x3 )
0 = ( 0 , 0 ,0 )

Sigma =
enter image description here

I am able to build the co-variance matrix using the following:

parameters {
    row_vector b[3];    
    real<lower=0> b_sigma[3];
    real<lower=-1, upper=1> b_rho[3];
}

transformed parameters {
    matrix[3,3] b_SIGMA;
    
    b_SIGMA[1,1] = b_sigma[1] ^ 2;
    b_SIGMA[2,2] = b_sigma[2] ^ 2;
    b_SIGMA[3,3] = b_sigma[3] ^ 2;

    b_SIGMA[1,2] = b_rho[1] * b_sigma[1] * b_sigma[2] ;
    b_SIGMA[2,1] = b_rho[1] * b_sigma[1] * b_sigma[2] ;

    b_SIGMA[3,1] = b_rho[2] * b_sigma[1] * b_sigma[3];
    b_SIGMA[1,3] = b_rho[2] * b_sigma[1] * b_sigma[3];

    b_SIGMA[2,3] = b_rho[3] * b_sigma[2] * b_sigma[3];
    b_SIGMA[3,2] = b_rho[3] * b_sigma[2] * b_sigma[3];
}

However this to me seems incredibly manual and inefficient. Is there a proper or recommended way for building such variance structures ?

On a highly related note PROC MIXED in SAS (see Table 56.15) offers a wide array of “out the box” variance structures such as unstructured, compound symmetry, autoregressive , etc. Is there an equivalent in STAN or will I need to manually construct them each time ?

NOTE: As this question was more theoretical I assumed that data + a fully working example was not beneficial. I am happy to provide data + fully working example though if people would like to play around with it or deem otherwise.

In the parameters block, you can declare

parameters {
  row_vector[3] b; // this was wrong in the OP
  cov_matrix[3] b_SIGMA:
}

and delete all of transformed parameters.

However, that is often unnecessarily slow, numerically slow, and difficult to specify a prior on. The way many Stan users do it is like

parameters {
  row_vector[3] b;
  cholesky_factor_corr[3] b_L_corr;
  vector<lower=0>[3] sigma;
}
model {
  matrix[3,3] b_L = diag_pre_multiply(sigma, b_L_corr);
  // call target += multi_normal_cholesky_lpdf(b | rep_row_vector(0, 3), b_L) somewhere
  target += lkj_corr_cholesky(1); // or maybe increase the shape hyperparameter
}

However, that usually results in poor sampling. It is often but not always better to use the stochastic representation of the multivariate normal (or really any elliptical) distribution. To do that, you do

parameters {
  vector[3] b_raw;
  cholesky_factor_corr[3] b_L_corr;
  vector<lower=0>[3] sigma;  
}
transformed parameters {
  row_vector[3] b = transpose( diag_pre_multiply(sigma, b_L_corr) * b_raw );
}
model {
  target += normal_lpdf(b_raw | 0, 1); 
  // implies b ~ multi_normal_cholesky()
  target += lkj_cholesky_corr(1);
  // some prior on sigma
  // likelihood
}

However, the way it is done in the rstanarm R package is to declare proportions of variances (as a simplex) and a scale parameter in the parameters block, rather than standard deviations. In that case, you have

parameters {
  row_vector[3] b;
  cholesky_factor_corr[3] b_L_corr;
  simplex[3] pi; // proportions of variance
  real<lower=0> gamma; // scale parameter
} 
transformed parameters {
  vector[3] sigma = sqrt(pi) * sqrt(3) * gamma;
}
model {
  target += normal_lpdf(b_raw | 0, 1); 
  // implies b ~ multi_normal_cholesky()
  target += lkj_cholesky_corr(1);
  // some prior on gamma
  // you can do a Dirichlet prior on pi but uniform over simplexes is usually fine
  // likelihood
}
2 Likes

Those you have to construct.

Hi @bgoodri,

Thank you for the detailed reply (you were also the first person to help me with another Stan question on stack overflow 8 months ago so again thank you for your continued support!!) .

I haven’t used or studied Cholesky factors before but quickly skimming wiki and Stan reference manual it seems reasonably initiative. I will try and adapt my code to use the methods you have recommended !

Is there any to run or test Stans inbuilt functions? For example it would be nice to test some of these matrix operation functions you’ve mentioned on dummy values just to make sure I am fully understanding what they are doing.

In R, see help(expose_stan_functions, package = "rstan").