because lkj_corr_cholesky(1); on any of the constrained cholesky_factor_corr matrices returns -nan, I think because some of the entries in the constrained matrix end up being 0? Anyone seen anything like this? cc @bgoodri as the resident Cholesky Factor, maybe I’m doing something really dumb here.
The error messages:
Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can't start sampling from this initial value.
Initialization between (-2, 2) failed after 100 attempts.
For Cholesky factors that big, it is possible that the diagonal elements result in the square root of a slightly negative number when symbolically they should be slightly positive. If you search for the “onion method” on the Stan-users Google group, there is some code for a more numerically stable way to construct a Cholesky factor of a large correlation matrix.
The onion method is more promising since it is derived from unit vectors to the left of the diagonal on the Cholesky factor of a KxK correlation matrix, but there is no easy way in the Stan language to get as many unit vectors as there are countries for each value of k between 2 and K. Link to post.
Is this issue still an issue in current stan?
Also, I got a bunch of warnings when I tested the stan code about not filling in the L matrix. Not sure if that matters. I instantiated L as a zero matrix and I removed that nested loop. I didn’t see any performance improvements but I also didn’t extensively test.
data {
int<lower = 0> K;
real<lower = 0> eta;
}
transformed data {
real<lower = 0> alpha = eta + (K - 2) / 2.0;
vector<lower = 0>[K-1] shape1;
vector<lower = 0>[K-1] shape2;
shape1[1] = alpha;
shape2[1] = alpha;
for(k in 2:(K-1)) {
alpha -= 0.5;
shape1[k] = k / 2.0;
shape2[k] = alpha;
}
}
parameters {
row_vector[choose(K, 2) - 1] l; // do NOT init with 0 for all elements
vector<lower = 0,upper = 1>[K-1] R2; // first element is not really a R^2 but is on (0,1)
}
model {
// priors that imply Sigma ~ lkj_corr(eta)
l ~ std_normal();
R2 ~ beta(shape1, shape2);
}
generated quantities {
matrix[K,K] Sigma;
matrix[K,K] L = rep_matrix(0, K, K);
{
int start = 1;
int end = 2;
L[1,1] = 1.0;
L[2,1] = 2.0 * R2[1] - 1.0;
L[2,2] = sqrt(1.0 - square(L[2,1]));
for(k in 2:(K-1)) {
int kp1 = k + 1;
row_vector[k] l_row = segment(l, start, k);
real scale = sqrt(R2[k] / dot_self(l_row));
L[kp1, 1:k] = l_row[1:k] * scale;
L[kp1, kp1] = sqrt(1.0 - R2[k]);
start = end + 1;
end = start + k - 1;
}
Sigma = multiply_lower_tri_self_transpose(L);
}
}
Yes, that’s correct. The code runs but gives warnings. Would this be a candidate for pedantic mode to tell users that this warning can be ignored if data are in the cells that the code or functions reference? Such as, "matrix is partially empty. This is fine if the code or functions only reference non-empty cells such as when referencing the lower triangular part of a matrix in multiply_lower_tri_self_transpose". @rok_cesnovar
Is there some situation when there is a danger of this happening? Shouldn’t Stan’s default initialization provide random non-zero values pretty much always?