I am relatively new to Stan and have some difficulties dealing with the following situation. I have a block covariance matrix where each block has a different size (sizes are known). I want to formulate the model in terms of Cholesky factors of the correlation matrix. In principle, I could declare the corresponding Cholesky factor and scale vector for each block manually in the parameters block, but there are in the order of 25 blocks and so this would be quite tedious. Furthermore, if new data become available in the future the number of blocks and their size will likely change and so I would prefer not to have to re-declare all variables again. If all blocks were the same size, then I could simply declare an array of matrices, but that’s not the case.
Initially I thought of declaring a vector in the parameters block containing all the non-zero elements of the covariance matrix (for all blocks) and then place LKJ priors on the sub-matrices in the model block by means of a loop, something like the following:
parameters{
vector[npar] q; // npar is the total number of non-zero elements in the full covariance matrix
}
model {
for(i in 1:nblocks){
to_matrix(q[ks[i]:ke[i]],n[i],n[i]) ~ lkj_corr_cholesky(1);
// in the line above, 'ks' and 'ke' are vectors containing the first and last
// indices of the elements in 'q' corresponding to each block in the
// covariance matrix, whereas 'n' contains the number of rows (and columns) of each block. 'nblocks' is the number of blocks (~25).
}
}
Then, I would reassemble the full covariance matrix in the transformed parameters block. The code compiles ok but does not seem to work. So my question is, is there an elegant way to address this problem that does not involve declaring each sub-covariance matrix manually? Many thanks for any tips.
The index vectors ks and ke are set correctly. The code compiles ok, but upon running it throws an error stating that the “random variable is not lower triangular”. And indeed, if I print “to_matrix(q[ks[1]:ke[1]],n[1],n[1])” in the transformed parameters block, the output is a full matrix, which I fail to understand since I placed a lkj_corr_cholesky(1) prior on this matrix. Any hints on what I am doing wrong?
Ahh ok. I missed it the first time, but I think I know what’s going on. That’s helpful.
Yeah so the lkj_corr_cholesky(1.0) distribution is for matrices of the type cholesky_factor_corr[K] L;. This type represents correlation matrices that are Cholesky factored in to lower-triangular form.
What you have is just an ordinary matrix. So either you want to just use the regular lkj_corr() distribution on your matrices since they’re not lower-triangular, OR you should factorize your matrices using the Cholesky algorithm to make them lower-triangular. So you your choices are:
model {
for(i in 1:nblocks){
to_matrix(q[ks[i]:ke[i]],n[i],n[i]) ~ lkj_corr(1);
}
}
OR
model {
for(i in 1:nblocks){
cholesky_factor_corr[n[i]] L = cholesky_decompose(to_matrix(q[ks[i]:ke[i]],n[i],n[i]) );
L ~ lkj_corr_cholesky(1);
}
}
According the Stan manual, the second is preferred, but I’m not sure why. I’d imagine numerical issues of keeping the matrix symmetric. @bgoodri would know more.
Edit: Oh I forgot to mention, and maybe you already did this. But to_matrix() fills by columns first. So make sure your input in the q vector is correctly ordered for that and make sure it will lead to symmetric matrices. You can check this with a simple print("Matrix Values: ", to_matrix(q[ks[i]:ke[i]],n[i],n[i])) statement.