"Initialization failed" error messages

I realize that this issue may be technically challenging, but in my experience many of my most time-consuming Stan problems come with the vague “initialization failed” error message. Tracking the source of the problem through the computational graph might be undesirable because of the computational cost, so I can understand that some issues have to be bundled into the “initialization failed” category. However, the current error also seems to cover things that are not really initialization problems. As an example, I recently had a transformed parameter matrix, part of a larger data-loading setup, in which I accidentally forgot to fill in a number of rows. This resulted in “initialization failed” even though the problem had nothing to do with initialization.

I am honestly not sure what, if anything, should or could be done about the issue, but thought it was worth bringing up for discussion.

What interface and version are you using? That might have been due to a bug masking messages that would have helped pinpoint the error.

I am using the development branch of PyStan, version 2.17.1.0, along with whichever version of Stan ships with it.
There’s a bit more error output in the terminal, but it is usually just explaining why the initialization failed: “Gradient evaluated at the initial value is not finite.”, “probability is log(0)”, etc. without any further context, making it hard to pinpoint where the problem arises.

Here is another recent example, although this one actually was an initialization problem. The following MWE

import pystan

stancode = '''
data {
    int D;
}
parameters {
    cholesky_factor_corr[D] corr_chol;
}
model {
    corr_chol ~ lkj_corr_cholesky(1);
}
'''
# Compile the model, then optimize with the default random initialization.
model = pystan.StanModel(model_code=stancode)
model.optimizing({'D': 100})

generally fails to initialize despite not involving any likelihood. The issue seems to be that the transformation from unconstrained space to constrained space scales with D somehow, leading to problems when D is high (for low D this works fine).

Here the error message might be justifiable, but given that the reported issue was “Log probability evaluates to log(0), i.e. negative infinity.”, it took me a while to realize that the problem was not with my likelihood, my data, or the syntax of my code, and not even with the choice of prior, but rather with the automatic initialization used.
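The suspicion that the constraint transform scales with D can be explored outside of Stan. What follows is only a rough sketch in plain Python, not Stan's actual code: it assumes the transform has the form described in the Stan reference manual (tanh to partial correlations, then a row-by-row construction), the function names are made up for illustration, and the ordering of the unconstrained vector is arbitrary. It ignores the Jacobian and any validity checks Stan performs, so it does not confirm the cause of the failure. It only shows that, under these assumptions, the diagonal entries shrink roughly geometrically with the row index when starting from random values in [-2, 2].

```python
import math
import random


def cholesky_corr_from_unconstrained(y, D):
    """Map D*(D-1)/2 unconstrained reals to a D x D Cholesky factor of a
    correlation matrix (assumed form of the transform, see lead-in)."""
    L = [[0.0] * D for _ in range(D)]
    k = 0
    for i in range(D):
        remaining = 1.0  # squared row length still available in row i
        for j in range(i):
            z = math.tanh(y[k])  # partial correlation in (-1, 1)
            k += 1
            L[i][j] = z * math.sqrt(remaining)
            remaining *= 1.0 - z * z
        L[i][i] = math.sqrt(remaining)  # product of sqrt(1 - z^2) over the row
    return L


def random_unconstrained(D, rng):
    """Draw a start uniformly from [-2, 2], mimicking Stan's default init."""
    return [rng.uniform(-2.0, 2.0) for _ in range(D * (D - 1) // 2)]


rng = random.Random(0)
for D in (5, 20, 100):
    L = cholesky_corr_from_unconstrained(random_unconstrained(D, rng), D)
    print(D, "last diagonal entry:", L[-1][-1])

# Starting from all zeros gives the identity matrix for any D.
L_zero = cholesky_corr_from_unconstrained([0.0] * (100 * 99 // 2), 100)
print("zero init, last diagonal entry:", L_zero[-1][-1])
```

In this sketch the last diagonal entry is a product of D - 1 factors that are each below one, so it becomes tiny for large D, while an all-zero start maps exactly to the identity matrix.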

Initializing to 0 makes it sample (even with D = 1000). This might indicate that there are numerical issues with the constraint, but that’s just a guess.

Can you verify that initializing it to 0 works?

I ended up initializing it to the identity matrix in my original code and that worked fine. The default random initialization on [-2, 2] in the unconstrained space apparently just does not work very well with this particular constraint transform (high gradient?).
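For anyone hitting the same problem, both workarounds can be passed through the init argument of PyStan 2's optimizing method. The sketch below is hedged accordingly: the helper names are hypothetical, model is assumed to be the compiled StanModel from the MWE, and it is assumed that PyStan accepts nested lists for the matrix (a NumPy array such as numpy.eye(D) should work as well).

```python
def make_identity_init(D):
    """Return a callable that yields an identity-matrix start for corr_chol.

    PyStan 2 documents `init` as accepting a function returning a dict.
    """
    def init():
        identity = [[1.0 if i == j else 0.0 for j in range(D)]
                    for i in range(D)]
        return {'corr_chol': identity}
    return init


def optimize_from_identity(model, D):
    # `model` is assumed to be a compiled pystan.StanModel (PyStan 2.x).
    return model.optimizing(data={'D': D}, init=make_identity_init(D))


def optimize_from_zero(model, D):
    # init=0 starts every unconstrained parameter at zero.
    return model.optimizing(data={'D': D}, init=0)
```

Usage would be along the lines of optimize_from_identity(model, 100). For this model the two starts should coincide, since an all-zero unconstrained vector corresponds to the identity Cholesky factor.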

My point was not so much that it could not be solved, just that a lot of different issues are conflated under the “initialization failed” message. This is a user experience issue, and I realize that there might not be a viable technical solution for it. I just wanted to bring it up for discussion, since my impression is that it is common.

This stuff would be great to improve, but it would require changes at the math library level that would need to be coordinated with how Stan catches exceptions. It’s not an easy area to change, given the coordination required and how performance-sensitive the code that needs better error reporting is.