How to estimate a large correlation matrix when Cholesky factorization isn't working

In my field (neuroimaging) we often use “atlases” of brain regions (such as the 90-region AAL or the 264-region Power atlas) to characterize resting-state brain networks for graph-theoretic analysis. One thing I was hoping to do is use Stan to estimate such a network. Usually I’d use the Ledoit-Wolf / Schäfer & Strimmer shrinkage method, which is very fast but only gives point estimates.
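For reference, this is the kind of shrinkage point estimate I mean. A minimal sketch, assuming the corpcor package (which implements the Schäfer & Strimmer estimator); the package and variable names here are just for illustration:

library(corpcor)

# Schaefer & Strimmer shrinkage point estimate of the correlation matrix
x = scale(na.omit(psych::bfi)[1:200, 1:25])
R_shrink = cor.shrink(x)  # returns a single point estimate, no posterior uncertainty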

The Stan code below (which I believe I found on this forum) works great if I’m working with just 5-10 variables. However, beyond that it starts to get slow. Above about 25 variables the sampler essentially stops making progress, and variational Bayes methods fail. This happens even with larger values of the LKJ prior concentration (2-10), and even when N > P, although it’s not uncommon to have to work with N ≈ P or P > N cases. How might I adjust the model below so it works at larger dimensions?

data {
  real lambda; // shape (shrinkage) parameter of the LKJ prior
  int<lower=1> N; // number of observations
  int<lower=1> P; // dimension of observations
  vector[P] y[N]; // observations
  vector[P] Zero; // vector of zeros (fixed means of the observations)
}
parameters {
  cholesky_factor_corr[P] Lcorr;  
  vector<lower=0>[P] sigma; 
}
model {
  y ~ multi_normal_cholesky(Zero, diag_pre_multiply(sigma, Lcorr));
  sigma ~ cauchy(0, 1);
  Lcorr ~ lkj_corr_cholesky(lambda);
}
generated quantities {
  matrix[P,P] Omega;
  matrix[P,P] Sigma;
  Omega = multiply_lower_tri_self_transpose(Lcorr);
  Sigma = quad_form_diag(Omega, sigma); 
}

An example of the issue:

library(rstan)

stanmodel = stan_model(model_code = corr)  # corr holds the Stan program above
bfi = as.matrix(scale(na.omit(psych::bfi)[1:200, 1:25]))  # 200 observations, 25 items
standata = list(N = 200, P = 25, y = bfi, Zero = colMeans(bfi), lambda = 2)  # columns are scaled, so Zero is essentially a vector of zeros
results = sampling(stanmodel, standata)

I ran this example and all of the transitions hit the maximum treedepth. That suggests the shape of the posterior is such that the sampler has to take very small steps in some places and cannot take enough of them per iteration to move around the parameter space effectively.
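If you want to confirm the diagnostic and give the sampler more room, recent rstan versions let you check HMC diagnostics and raise the treedepth limit and acceptance target via the control argument. The specific values below are only illustrative; raising them trades speed for depth and does not fix the underlying geometry:

# check HMC diagnostics (treedepth, divergences, energy) for an existing fit
check_hmc_diagnostics(results)

# re-run with a higher treedepth limit and tighter step-size adaptation
# (15 and 0.95 are guesses, not recommended settings)
results = sampling(stanmodel, standata,
                   control = list(max_treedepth = 15, adapt_delta = 0.95))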

For a model this simple, there is not a whole lot you can do with the parameterization. I would look for a transformation of the data that makes it more amenable to a multivariate normal.
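As one illustration of that suggestion (my own sketch, not a recommendation specific to these data), a rank-based inverse-normal transform of each column is a simple way to make skewed or coarsely discrete items look closer to Gaussian before fitting:

# rank-based inverse-normal (normal scores) transform, applied column-wise;
# whether this is appropriate depends on the data
to_normal_scores = function(x) qnorm((rank(x) - 0.5) / length(x))
bfi_gauss = apply(bfi, 2, to_normal_scores)
standata$y = bfi_gauss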