Question on cholesky_factor_corr and the resulting correlation matrix

Hello!

A basic question from a stan novice…

In my model, I use cholesky decomposition to estimate the covariance matrix. I also recorded the generated correlation and covariance matrices in the block “generated quantities”. Furthermore, I reparameterzied the half cauchy distribution for the standard deviation sigma.

I am able to obtain the lower triangular matrix L_corrmat and the correlation matrix Thetasmat:

> Lcorr_summary <- summary(fit.stan, pars = "L_corr")
> L_corrmat <- matrix(Lcorr_summary$summary[,1], nrow=7, byrow=TRUE)
> round(L_corrmat,3)
       [,1]   [,2]   [,3]   [,4]   [,5]  [,6]  [,7]
[1,]  1.000  0.000  0.000  0.000  0.000 0.000 0.000
[2,] -0.074  0.944  0.000  0.000  0.000 0.000 0.000
[3,]  0.044 -0.434  0.806  0.000  0.000 0.000 0.000
[4,] -0.062 -0.097 -0.117  0.822  0.000 0.000 0.000
[5,]  0.008  0.091 -0.221  0.062  0.764 0.000 0.000
[6,]  0.010 -0.100 -0.184 -0.010 -0.030 0.681 0.000
[7,] -0.085  0.111 -0.288 -0.044  0.074 0.042 0.599

> Thetas_summary <- summary(fit.stan, pars = "Thetas")
> Thetasmat <- matrix(Thetas_summary$summary[,1], nrow=7, byrow=TRUE)
> round(Thetasmat,3)
       [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]
[1,]  1.000 -0.074  0.044 -0.062  0.008  0.010 -0.085
[2,] -0.074  1.000 -0.459 -0.095  0.095 -0.107  0.124
[3,]  0.044 -0.459  1.000 -0.071 -0.256 -0.129 -0.336
[4,] -0.062 -0.095 -0.071  1.000  0.082  0.020 -0.014
[5,]  0.008  0.095 -0.256  0.082  1.000 -0.005  0.157
[6,]  0.010 -0.107 -0.129  0.020 -0.005  1.000  0.089
[7,] -0.085  0.124 -0.336 -0.014  0.157  0.089  1.000

However, when I perform the cholesky decomposition on Thetasmat in R, the resulting matrix is different from stan output L_corrmat:

> t(round(chol(Thetasmat),3))
       [,1]   [,2]   [,3]   [,4]   [,5]  [,6]  [,7]
[1,]  1.000  0.000  0.000  0.000  0.000 0.000 0.000
[2,] -0.074  0.997  0.000  0.000  0.000 0.000 0.000
[3,]  0.044 -0.457  0.888  0.000  0.000 0.000 0.000
[4,] -0.062 -0.099 -0.128  0.985  0.000 0.000 0.000
[5,]  0.008  0.096 -0.240  0.062  0.964 0.000 0.000
[6,]  0.010 -0.106 -0.200 -0.016 -0.044 0.973 0.000
[7,] -0.085  0.118 -0.313 -0.048  0.077 0.043 0.933

I also cannot obtain the correlation matrix Thetasmat using \Theta = LL^T:

> round(L_corrmat %*% t(L_corrmat), 3)
       [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]
[1,]  1.000 -0.074  0.044 -0.062  0.008  0.010 -0.085
[2,] -0.074  0.897 -0.413 -0.087  0.086 -0.095  0.111
[3,]  0.044 -0.413  0.839 -0.055 -0.217 -0.104 -0.284
[4,] -0.062 -0.087 -0.055  0.702  0.067  0.022 -0.008
[5,]  0.008  0.086 -0.217  0.067  0.645  0.008  0.127
[6,]  0.010 -0.095 -0.104  0.022  0.008  0.508  0.068
[7,] -0.085  0.111 -0.284 -0.008  0.127  0.068  0.470

Somehow I just can’t figure out why and am wondering what I am missing here. Any feedback is appreciated!

My simplified code is as follows:

data {
  ...
  int<lower=1> R; // no. of respondents
  int<lower=1> S[R]; // observation per respondent 
  int<lower=3> C; // no. of choices in each scenario
  int<lower=1> N; // total observations
  int Y[N]; // observed choices 
  ...
}

parameters {
  ...
  vector<lower=0, upper=pi()/2>[J-1] sigmas_unif;
  cholesky_factor_corr[J-1] L_corr;
  matrix[R,J-1] alphas_raw;
...
}

transformed parameters {
  ...
  vector<lower=0>[J-1] sigmas=2.5*tan(sigmas_unif);  //sigmas ~ cauchy(0, 2.5);
  matrix[R, J-1] alphas;
  for (r in 1:R){
    alphas[r] = to_row_vector(mus + diag_pre_multiply(sigmas, L_corr)*to_vector(alphas_raw[r]));
  }
}

model {
  int yr;
  int xr;
  yr = 1;
  xr = 1;
  
  // priors
  ...
  mus ~ normal(0, 1000);
  L_corr ~ lkj_corr_cholesky(2);
...
  
  // likelihood
  for (r in 1:R) { // for each individual
    int ypos;
    int xpos;
    ypos = yr;
    xpos = xr;
    
    to_vector(alphas_raw[r]) ~ std_normal();	
   
    for (s in 1:S[r]) { 
    Y[ypos] ~ categorical_logit(block(Xs, xpos, 1, C, 7)*to_vector(alphas[r])
              +block(X2, xpos, 1, C, W)*omega - 1/delta*segment(P, xpos, C)); 
    ypos = ypos + 1;
    xpos = xpos + C;
    }
    yr = yr + S[r];
    xr = xr + S[r]*C;
  }
}

generated quantities {
  matrix[J-1, J-1] Thetas = multiply_lower_tri_self_transpose(L_corr);
  matrix[J-1, J-1] Sigmas = quad_form_diag(Thetas, sigmas); 
}

The Cholesky factor of the average correlation matrix is not equal to the average Cholesky factor. Other than that, I don’t think you are doing anything wrong in your code.

Thank you for your response!

Just to be sure - does this mean that the only way to get the correlation matrix estimates is using multiply_lower_tri_self_transpose(L) in stan?

If I do not include multiply_lower_tri_self_transpose(L) in my code, are there other ways to recover the correlation matrix estimates using the average Cholesky factor?

You could do multiply_lower_tri_self_transpose(L) in the generated quantities block of the Stan program or you could do something like this in R

L <- extract(fit, params = "L_corr")[[1]] # 4000 x 7 x 7
Theta <- apply(L, MARGIN = 1, FUN = tcrossprod)

The elementwise average of Theta will match what summary shows.

1 Like

I understand it now. Thank you!