How to estimate a large correlation matrix when Cholesky factorization isn't working

In my field (neuroimaging) we often use “atlases” of brain regions (such as the 90-region AAL or the 264-region Power atlas) to characterize resting-state brain networks for graph-theoretic analysis. One thing I was hoping to do is use Stan to estimate such a network. Usually I’d use the Ledoit-Wolf / Schäfer & Strimmer shrinkage method, which is very fast but only gives point estimates.
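For reference, this is the kind of shrinkage point estimate I mean. A minimal sketch, assuming the corpcor package (which implements the Schäfer & Strimmer estimator); the package and variable names here are just for illustration:

library(corpcor)

# Schaefer & Strimmer shrinkage point estimate of the correlation matrix
x = scale(na.omit(psych::bfi)[1:200, 1:25])
R_shrink = cor.shrink(x)  # returns a single point estimate, no posterior uncertainty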

The Stan code below (which I believe I found on this forum) works great if I’m working with just 5-10 variables. However, beyond that it starts to get slow. Above about 25 variables the sampler essentially stops making progress, and variational Bayes methods fail. This happens even with larger values of the LKJ prior concentration (2-10), and even when N > P, although it’s not uncommon to have to work with N ≈ P or P > N cases. How might I adjust the model below so it works at larger dimensions?

data {
  real lambda; // shape (shrinkage) parameter of the LKJ prior
  int<lower=1> N; // number of observations
  int<lower=1> P; // dimension of observations
  vector[P] y[N]; // observations
  vector[P] Zero; // vector of zeros (fixed means of the observations)
}
parameters {
  cholesky_factor_corr[P] Lcorr;  
  vector<lower=0>[P] sigma; 
}
model {
  y ~ multi_normal_cholesky(Zero, diag_pre_multiply(sigma, Lcorr));
  sigma ~ cauchy(0, 1);
  Lcorr ~ lkj_corr_cholesky(lambda);
}
generated quantities {
  matrix[P,P] Omega;
  matrix[P,P] Sigma;
  Omega = multiply_lower_tri_self_transpose(Lcorr);
  Sigma = quad_form_diag(Omega, sigma); 
}

An example of the issue:

library(rstan)

stanmodel = stan_model(model_code = corr)  # corr holds the Stan program above
bfi = as.matrix(scale(na.omit(psych::bfi)[1:200, 1:25]))  # 200 observations, 25 items
standata = list(N = 200, P = 25, y = bfi, Zero = colMeans(bfi), lambda = 2)  # columns are scaled, so Zero is essentially a vector of zeros
results = sampling(stanmodel, standata)

I ran this example and all of the transitions hit the maximum treedepth. That suggests the shape of the posterior is such that the sampler has to take very small steps in some places and cannot take enough of them per iteration to move around the parameter space effectively.
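If you want to confirm the diagnostic and give the sampler more room, recent rstan versions let you check HMC diagnostics and raise the treedepth limit and acceptance target via the control argument. The specific values below are only illustrative; raising them trades speed for depth and does not fix the underlying geometry:

# check HMC diagnostics (treedepth, divergences, energy) for an existing fit
check_hmc_diagnostics(results)

# re-run with a higher treedepth limit and tighter step-size adaptation
# (15 and 0.95 are guesses, not recommended settings)
results = sampling(stanmodel, standata,
                   control = list(max_treedepth = 15, adapt_delta = 0.95))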

For a model this simple, there is not a whole lot you can do with the parameterization. I would look for a transformation of the data that makes it more amenable to a multivariate normal.
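As one illustration of that suggestion (my own sketch, not a recommendation specific to these data), a rank-based inverse-normal transform of each column is a simple way to make skewed or coarsely discrete items look closer to Gaussian before fitting:

# rank-based inverse-normal (normal scores) transform, applied column-wise;
# whether this is appropriate depends on the data
to_normal_scores = function(x) qnorm((rank(x) - 0.5) / length(x))
bfi_gauss = apply(bfi, 2, to_normal_scores)
standata$y = bfi_gauss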