# Initialization Issues with Cholesky Factorization of Covariance Matrix

Hey y’all! I am trying to run a hierarchical Bernoulli logit model with a multivariate normal prior on the coefficients with priors on the mean and covariance of the multivariate normal. I am running into initialization issues (“Initialization failed” in pystan) when using the Cholesky factorization of the covariance matrix (yet I was not running into these issues when using the inverse wishart prior, but that has been extremely slow so far). Additionally, I am not entirely sure if the model is specified correctly as I am fairly new to stan. Any help would be greatly, greatly appreciated as I am working on my masters thesis and these computational issues are really getting in the way and stressing me out!! Cheers

``````data {
int<lower=1> D; //Number of individual predictors
int<lower=0> N; //Number of data points
int<lower=1> P; //Number of levels
int<lower=0, upper =1> y[N]; //Classification
int<lower=0, upper = P> ll[N]; //levels of each data point
row_vector[D] x[N]; //Matrix of individual predictors
matrix[D,D] identity; //Identity matrix used as a prior
vector[D] zeros; // Vector of zeros used as a prior
}
parameters {
vector[D] mu; //Mean vector of multivariate normal
vector[D] beta[P]; //Coeficients for each of the various levels
cholesky_factor_corr[D] L_Omega; //Cholesky correlation
vector<lower=0, upper=pi()/2>[D] tau_unif; //cholesky scale

}
transformed parameters {
vector<lower=0>[D] tau; //cholesky scale
matrix[D,D] Sigma_beta; //Covariance matrix defined from cholesky
for (d in 1:D) {
tau[d] = 2.5 * tan(tau_unif[d]);
}
Sigma_beta = diag_pre_multiply(tau, L_Omega); //Calculation of covariance from cholesky

}

model {
mu ~ multi_normal(zeros, identity); //Prior on multivariate normal
L_Omega ~ lkj_corr_cholesky(3);
tau_unif ~ normal(0,1);

for (p in 1:P){
beta[p] ~ multi_normal(mu, Sigma_beta);
}

{
vector[N] x_beta_ll;
for (n in 1:N) {
x_beta_ll[n] = x[n] * beta[ll[n]];
}
y ~ bernoulli_logit(x_beta_ll);
}
}

Thank you so much in advance!``````

I think there are two main issues here. The first is that you don’t need to construct the standard deviations on the cholesky scale, you can just estimate them directly and use them to construct the covariance matrix.

The second is that you’re passing a cholesky factor to a distribution that expects a full covariance matrix (since `diag_pre_multiply(tau, L_Omega)` returns a cholesky factor). If you change the prior for `beta[p]` to `multi_normal_cholesky`, that should help.

There are also some other optimisations that you can look at here, which will speed things up a bit for you.

First, the multivariate prior for `mu` is actually fairly inefficient, and would perform better with a univariate prior, thanks to Stan’s vectorisation, this can be equivalently specified as:

``````mu ~ std_normal();
``````

Similarly, the loop isn’t needed in the prior for `beta` and can be specified as:

``````beta ~ multi_normal_cholesky(mu, Sigma_beta);
``````

If you find issues (performance or quality) in the sampling for `beta`, you could also try a non-centered reparameterisation (see this section of the manual for more info)

Hope some of that helps!

Cheers,
Andrew