I’m getting cholesky_decompose errors when sampling with a large covariance matrix (~100 < rows < 1000) that is assembled with a Kronecker-product-like for loop.
The loop is similar to the Gaussian process covariance matrix example in the Stan reference guide (without using cov_exp_quad), but it loops over the independent variable x[N]
multiple times (M times, where M can vary from 2 to dozens), generating a covariance matrix of size MN \times MN. For M=2, for instance, I have:
K = \begin{bmatrix} K_{11} & K_{21} \\ K_{12} & K_{22} \end{bmatrix}
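Roughly, the assembly loop looks like the following NumPy sketch. The squared-exponential kernel and the way each block’s hyperparameters are combined are simplified stand-ins, not my actual model:

```python
import numpy as np

def se_kernel(x, alpha, rho):
    # Squared-exponential kernel, the NumPy analogue of Stan's cov_exp_quad.
    d = x[:, None] - x[None, :]
    return alpha**2 * np.exp(-0.5 * (d / rho)**2)

def assemble_block_cov(x, alphas, rhos):
    # Tile the MN x MN covariance matrix out of N x N blocks K_ij,
    # looping over x once per block, as in my Stan code.  The way the
    # per-block hyperparameters are combined here is a placeholder.
    N, M = len(x), len(alphas)
    K = np.zeros((M * N, M * N))
    for i in range(M):
        for j in range(M):
            a = np.sqrt(alphas[i] * alphas[j])  # symmetric, so K_ij = K_ji
            r = 0.5 * (rhos[i] + rhos[j])
            K[i*N:(i+1)*N, j*N:(j+1)*N] = se_kernel(x, a, r)
    return K
```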
The Stan model compiles fine, but I am getting RuntimeError: Exception: cholesky_decompose: Matrix m is not positive definite
. That is somewhat expected: a large matrix of this kind is very constrained, and even though the K_{ij} submatrices have positive diagonals, K may end up with an odd structure, I guess. But I am able to choose parameters such that K is positive definite, and from there it gets a little weirder. Here are some observations:
[1]: I am testing positive definiteness for the parameter values in Python before sampling (with a Python version of the loop and numpy.linalg.cholesky), and I get no errors when computing L = cholesky(K).
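The Python-side check is essentially just this (the jitter argument is something I added for experimenting, not part of the model):

```python
import numpy as np

def is_choleskyable(K, jitter=0.0):
    # Return True if NumPy can Cholesky-factor K, optionally after
    # adding a small jitter to the diagonal.
    try:
        np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))
        return True
    except np.linalg.LinAlgError:
        return False
```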
[2]: If I pass this fixed K_{test} matrix to Stan as data and have Stan compute cholesky_decompose
, I get no errors (nor during sampling, of course). If in parallel I also compute the real K matrix in Stan and print out the first terms, they are the same as the Python-computed matrix in the data dict; printing the differences of all terms shows nothing greater than 1e-16, so I don’t think there are any errors in building the matrix in Stan.
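The comparison itself is nothing fancy; note that the 1e-16 scale is right around double-precision machine epsilon, so the two matrices agree to round-off:

```python
import numpy as np

def max_abs_diff(K_stan, K_py):
    # Largest elementwise discrepancy between the matrix Stan prints
    # and the one built in Python; differences near
    # np.finfo(float).eps (~2.2e-16) are just round-off.
    return float(np.max(np.abs(np.asarray(K_stan) - np.asarray(K_py))))
```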
[3]: I can choose the initial parameters in a way that the assembled matrix is the identity matrix, for which I should have no problem, but I still get the same error.
[4]: Depending on the size of the matrix (i.e. on M), if I run a nearly infinite loop catching the runtime error and trying again, I may eventually be able to sample, but it’s kind of unpredictable, and for large M I didn’t manage to sample after a few hours of looping (a few hundred to a couple of thousand tries).
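The retry loop is just the obvious thing; try_sample here is a hypothetical zero-argument wrapper around my stan_model.sampling() call, not PyStan API:

```python
def sample_with_retries(try_sample, max_tries=1000):
    # try_sample runs one sampling attempt (with fresh inits) and
    # raises RuntimeError when cholesky_decompose fails; keep retrying
    # until one attempt succeeds or we give up.
    last_err = None
    for _ in range(max_tries):
        try:
            return try_sample()
        except RuntimeError as err:
            last_err = err
    raise RuntimeError(f"no successful sample in {max_tries} tries") from last_err
```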
Finally, thinking I was going crazy, I went back and rewrote all the Stan code from scratch based on the last similar example I had working, and I found that the problems started when I moved these calculations from the model
block to the transformed parameters
block. Using algorithm="Fixed_param"
, I can run a single iteration with PyStan’s stan_model.sampling() a hundred times with no errors for the former, but the latter gives me an error several times before I can sample even once.
Is there anything I’m missing here that may be causing this behavior?