# How to estimate a large correlation matrix when Cholesky factorization isn't working

In my field (neuroimaging) we often use “atlases” of brain regions (such as the 90-region AAL or the 264-region Power atlas) to characterize resting-state brain networks for graph-theoretic analysis. One thing I was hoping to do is use Stan to estimate such a network. Usually I’d use the Ledoit-Wolf / Schäfer & Strimmer shrinkage method, which is very fast, but it only gives point estimates.
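For reference, the core of that shrinkage approach is a convex pull of the sample correlation matrix toward the identity. A base-R sketch of the idea, with a fixed illustrative intensity rather than the analytically optimal one those methods compute:

```r
# Linear shrinkage of a sample correlation matrix toward the identity.
# 'delta' is a hypothetical fixed weight for illustration only; Ledoit-Wolf
# and Schaefer-Strimmer derive the optimal intensity from the data.
set.seed(1)
X <- matrix(rnorm(50 * 25), nrow = 50)  # N = 50 observations, P = 25 variables
R <- cor(X)                             # sample correlation matrix
delta <- 0.2                            # illustrative shrinkage intensity
R_shrunk <- delta * diag(ncol(X)) + (1 - delta) * R

# Shrinking toward the identity raises the smallest eigenvalue, so the
# estimate is better conditioned -- the reason this works when P is near N.
kappa(R_shrunk) < kappa(R)
```

The diagonal stays exactly 1, so the result is still a valid correlation matrix, just with the off-diagonal entries pulled toward zero.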

The Stan code below (which I believe I found on this forum) works great with just 5–10 variables, but beyond that it starts to get slow. Over 25 variables the sampler barely moves at all, and variational Bayes methods fail. This happens even with larger values of the LKJ shrinkage parameter (2–10), and even when N > P, although it’s not uncommon to have to work with N nearly equal to P, or P > N. How might I adjust the model below to work at larger scale?

```stan
data {
  real lambda;            // shrinkage parameter of the LKJ prior
  int<lower=1> N;         // number of observations
  int<lower=1> P;         // dimension of observations
  vector[P] y[N];         // observations
  vector[P] Zero;         // vector of zeros (fixed means of observations)
}
parameters {
  cholesky_factor_corr[P] Lcorr;
  vector<lower=0>[P] sigma;
}
model {
  y ~ multi_normal_cholesky(Zero, diag_pre_multiply(sigma, Lcorr));
  sigma ~ cauchy(0, 1);
  Lcorr ~ lkj_corr_cholesky(lambda);
}
generated quantities {
  matrix[P, P] Omega;
  matrix[P, P] Sigma;
  Omega = multiply_lower_tri_self_transpose(Lcorr); // correlation matrix
  Sigma = quad_form_diag(Omega, sigma);             // covariance matrix
}
```

An example of the issue:

```r
library(rstan)

stanmodel <- stan_model(model_code = corr)
bfi <- as.matrix(scale(na.omit(psych::bfi)[1:200, 1:25]))
standata <- list(N = 200, P = 25, y = bfi,
                 Zero = colMeans(bfi),  # effectively zero, since the data are standardized
                 lambda = 2)
results <- sampling(stanmodel, data = standata)
```

I ran this example and all of the transitions hit the maximum treedepth. That suggests the posterior is shaped such that the sampler needs very small steps in some places and cannot take enough of them to get around the parameter space effectively.
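If you just want the chains to finish adapting, you can raise the treedepth cap via rstan’s `control` argument, at real wall-clock cost; a sketch reusing the objects from the example above. This treats the symptom (truncated trajectories), not the underlying geometry:

```r
# Each unit increase in max_treedepth roughly doubles the worst-case cost
# per iteration; raising adapt_delta forces a smaller step size.
results <- sampling(stanmodel, data = standata,
                    control = list(max_treedepth = 15,   # default is 10
                                   adapt_delta   = 0.95)) # default is 0.8
```

Check the treedepth diagnostics again afterward; if transitions still saturate the cap, the problem is the model or the data, not the tuning.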

For a model this simple, there is not a whole lot you can do with the parameterization. I would look for a transformation of the data that makes it more amenable to a multivariate normal.
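One cheap diagnostic before reaching for transformations (a base-R sketch on synthetic data standing in for the real matrix): inspect the eigenvalues of the sample correlation matrix. Near-zero eigenvalues mean near-collinear columns, which is exactly the kind of geometry that forces tiny sampler steps.

```r
# Synthetic example: a fourth column that nearly duplicates the first.
set.seed(2)
Z <- matrix(rnorm(200 * 3), nrow = 200)
X <- cbind(Z, Z[, 1] + 0.01 * rnorm(200))

# Eigenvalues of the sample correlation matrix; the smallest one collapses
# toward zero when columns are nearly collinear.
ev <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
min(ev)  # near zero -> badly conditioned correlation matrix
```

Dropping or combining near-duplicate regions (or working in a reduced basis) shrinks P and removes the directions the sampler struggles with.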