I am working with a multivariate normal distribution. My model of interest is a full multivariate normal, but I am starting with something simple and building from it. The model is the following:

y \sim \mathcal{N}(0,\Sigma)

where the covariance matrix \Sigma is just a diagonal matrix

This can work without the multivariate normal, of course, but I am trying to get it to work this way to learn from it and slowly add more complexity. If I introduce the lkj_corr prior I start getting errors. Below is the Stan code:

data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real<lower=0> sigmaD;
  corr_matrix[N] Omega;
}
transformed parameters {
  vector<lower=0>[N] sigma;
  for (i in 1:N) {
    sigma[i] = sigmaD^2;
  }
}
model {
  sigmaD ~ gamma(11, 0.3);
  Omega ~ lkj_corr(3);
  y ~ multi_normal(rep_vector(0, N), quad_form_diag(Omega, sigma));
}

But I get errors such as these:

"Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: lkj_corr_lpdf: Correlation matrix is not positive definite.
Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
Initialization failed."

I don’t quite understand why this is happening. Can someone explain? Any suggestions?

@billWalker All correct, agreed. One should work with the Cholesky form instead.

The problem arises with increasing likelihood as the dimension of the correlation matrix grows. In the code above the dimension is D = N, assigning each datapoint its own variance, which cannot be estimated.

Even if D < N and D is large (> 30), the sampler may run into problems. I figured out two ways around it:

One may set init = 0 in the sampling call, which is a quick hack but not recommended. Another is to do your own initialization (see the Stan manuals for details): set the initial value of the correlation matrix, or better its Cholesky factor, to an identity matrix of the same dimension, e.g.
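For instance, a minimal sketch of the Cholesky-factor version of the model above (keeping the same single sigmaD scale as the original code; untested, for illustration only):

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real<lower=0> sigmaD;
  cholesky_factor_corr[N] L_Omega;  // Cholesky factor of the correlation matrix
}
model {
  sigmaD ~ gamma(11, 0.3);
  L_Omega ~ lkj_corr_cholesky(3);   // LKJ prior stated directly on the Cholesky factor
  // diag_pre_multiply(s, L) = diag_matrix(s) * L, i.e. the Cholesky factor
  // of the covariance quad_form_diag(Omega, s)
  y ~ multi_normal_cholesky(rep_vector(0, N),
                            diag_pre_multiply(rep_vector(sigmaD^2, N), L_Omega));
}
```

With this parameterization an identity initialization for L_Omega can then be passed from the interface, e.g. something like init = list(list(L_Omega = diag(N))) in RStan (the exact syntax depends on your interface).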

Thank you very much Bill; however, I don’t think the Cholesky form will save this. It seems from @andre.pfeuffer’s response that it has more to do with the size of the correlation matrix.

Thank you very much!! I see that if the matrix is too big, this multivariate normal (or its Cholesky version) will have problems, but I might be able to find a way around this for my problem!

That looks nice. What’s the intuition of the R2 parameter? Is that controlling the average level of correlation or the partial correlations? I also don’t have a good sense of what the create_shapes function is doing.

For the OP: I’ve run into problems with the LKJ prior over the years. Mine arose in problems where the variables are multivariate normal but highly correlated (as in a factor model), so the correlation parameters tend to be correlated in weird non-linear ways that Stan has difficulty with.

I am possibly missing something crucial, but I think you are vastly overcomplicating the problem. If you really know your covariance matrix to be diagonal, with the same variance along the diagonal, it’s just a product of identical univariate Gaussians from which you’re trying to estimate the variance. Isn’t this just a textbook problem?

You don’t need to work with the Cholesky factorisation: you can do it by hand, and it’s just the matrix with sigma along the diagonal. But indeed you can write the whole problem as a simple sampler for sigma.
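In Stan, the diagonal special case collapses to independent univariate normals (a sketch, untested; the scale is sigmaD^2 because that is what enters quad_form_diag in the original code):

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real<lower=0> sigmaD;
}
model {
  sigmaD ~ gamma(11, 0.3);
  // A multivariate normal with diagonal covariance factorizes into
  // independent univariate normals, here all with sd sigmaD^2
  y ~ normal(0, sigmaD^2);
}
```

This needs no correlation matrix at all, so the LKJ initialization problem never arises.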

My "real problem" is much more complicated, and the only reason I am doing it like this is so that I can get it to work in a simple case and then build from it. What I am aiming for is not a diagonal matrix, but rather a hidden Markov model where there is a multivariate normal distribution with covariation between consecutive observations. But we have to start somewhere!!

Thank you again very much for wanting to help!