# How to fit "standard" Co-variance structures

Hi All,

First off apologies this is a cross post from stack overflow as I have only just found this forum !!
EDIT: removed link to original post as apparently new users can only have 2 links per post >.<

I am currently using R-Stan to fit a multivariate normal distribution. The current model is

b ~ MVN ( 0 , Sigma )

where

b = ( x1 , x2 , x3 )
0 = ( 0 , 0 ,0 )

I am able to build the co-variance matrix using the following:

``````parameters {
row_vector b;
real<lower=0> b_sigma;
real<lower=-1, upper=1> b_rho;
}

transformed parameters {
matrix[3,3] b_SIGMA;

b_SIGMA[1,1] = b_sigma ^ 2;
b_SIGMA[2,2] = b_sigma ^ 2;
b_SIGMA[3,3] = b_sigma ^ 2;

b_SIGMA[1,2] = b_rho * b_sigma * b_sigma ;
b_SIGMA[2,1] = b_rho * b_sigma * b_sigma ;

b_SIGMA[3,1] = b_rho * b_sigma * b_sigma;
b_SIGMA[1,3] = b_rho * b_sigma * b_sigma;

b_SIGMA[2,3] = b_rho * b_sigma * b_sigma;
b_SIGMA[3,2] = b_rho * b_sigma * b_sigma;
}
``````

However this to me seems incredibly manual and inefficient. Is there a proper or recommended way for building such variance structures ?

On a highly related note PROC MIXED in SAS (see Table 56.15) offers a wide array of “out the box” variance structures such as unstructured, compound symmetry, autoregressive , etc. Is there an equivalent in STAN or will I need to manually construct them each time ?

NOTE: As this question was more theoretical I assumed that data + a fully working example was not beneficial. I am happy to provide data + fully working example though if people would like to play around with it or deem otherwise.

In the `parameters` block, you can declare

``````parameters {
row_vector b; // this was wrong in the OP
cov_matrix b_SIGMA:
}
``````

and delete all of `transformed parameters`.

However, that is often unnecessarily slow, numerically slow, and difficult to specify a prior on. The way many Stan users do it is like

``````parameters {
row_vector b;
cholesky_factor_corr b_L_corr;
vector<lower=0> sigma;
}
model {
matrix[3,3] b_L = diag_pre_multiply(sigma, b_L_corr);
// call target += multi_normal_cholesky_lpdf(b | rep_row_vector(0, 3), b_L) somewhere
target += lkj_corr_cholesky(1); // or maybe increase the shape hyperparameter
}
``````

However, that usually results in poor sampling. It is often but not always better to use the stochastic representation of the multivariate normal (or really any elliptical) distribution. To do that, you do

``````parameters {
vector b_raw;
cholesky_factor_corr b_L_corr;
vector<lower=0> sigma;
}
transformed parameters {
row_vector b = transpose( diag_pre_multiply(sigma, b_L_corr) * b_raw );
}
model {
target += normal_lpdf(b_raw | 0, 1);
// implies b ~ multi_normal_cholesky()
target += lkj_cholesky_corr(1);
// some prior on sigma
// likelihood
}
``````

However, the way it is done in the rstanarm R package is to declare proportions of variances (as a simplex) and a scale parameter in the `parameters` block, rather than standard deviations. In that case, you have

``````parameters {
row_vector b;
cholesky_factor_corr b_L_corr;
simplex pi; // proportions of variance
real<lower=0> gamma; // scale parameter
}
transformed parameters {
vector sigma = sqrt(pi) * sqrt(3) * gamma;
}
model {
target += normal_lpdf(b_raw | 0, 1);
// implies b ~ multi_normal_cholesky()
target += lkj_cholesky_corr(1);
// some prior on gamma
// you can do a Dirichlet prior on pi but uniform over simplexes is usually fine
// likelihood
}
``````
2 Likes

Those you have to construct.

Hi @bgoodri,

Thank you for the detailed reply (you were also the first person to help me with another Stan question on stack overflow 8 months ago so again thank you for your continued support!!) .

I haven’t used or studied Cholesky factors before but quickly skimming wiki and Stan reference manual it seems reasonably initiative. I will try and adapt my code to use the methods you have recommended !

Is there any to run or test Stans inbuilt functions? For example it would be nice to test some of these matrix operation functions you’ve mentioned on dummy values just to make sure I am fully understanding what they are doing.

In R, see `help(expose_stan_functions, package = "rstan")`.