Basically title. When should I prefer `multi_normal_cholesky` over `multi_normal`? Is the `lpdf` calculation significantly more performant under some conditions? Or is there some advantage for autodiff? The documentation simply states that it exists in the Stan math library, without any elaboration.
Under all conditions. By starting with a pre-factored covariance matrix, the evaluation is quadratic in dimension rather than cubic. It saves on both the determinant calculation (because a Cholesky factor is triangular, its log determinant is just the sum of the logs of its diagonal) and on the quadratic form, which reduces to a triangular solve instead of requiring the inverse of the covariance matrix. The autodiff is similarly sped up because it follows the basic evaluation.
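To make that concrete, here's a minimal numpy/scipy sketch (not Stan's actual implementation) of the two identities involved: with `Sigma = L @ L.T`, the log determinant is twice the sum of the logs of `L`'s diagonal, and the quadratic form `d' Sigma^-1 d` becomes `alpha' alpha` where `alpha` solves the triangular system `L alpha = d` in O(k²):

```python
import numpy as np
from scipy.linalg import solve_triangular

def mvn_lpdf_chol(y, mu, L):
    """Log density using a pre-factored lower-triangular L (Sigma = L @ L.T)."""
    k = len(y)
    half_log_det = np.sum(np.log(np.diag(L)))        # log|Sigma| = 2 * sum(log diag(L))
    alpha = solve_triangular(L, y - mu, lower=True)  # O(k^2) triangular solve
    return -0.5 * k * np.log(2 * np.pi) - half_log_det - 0.5 * alpha @ alpha

def mvn_lpdf_full(y, mu, Sigma):
    """Naive log density: O(k^3) determinant and linear solve every call."""
    k = len(y)
    _, logdet = np.linalg.slogdet(Sigma)
    d = y - mu
    return -0.5 * (k * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(Sigma, d))

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
Sigma = A @ A.T + 4 * np.eye(4)   # a well-conditioned covariance matrix
L = np.linalg.cholesky(Sigma)
mu = np.zeros(4)
y = rng.normal(size=4)
print(mvn_lpdf_chol(y, mu, L), mvn_lpdf_full(y, mu, Sigma))  # agree to rounding
```

If the covariance is a parameter that changes every leapfrog step, the factorization itself still has to happen somewhere, but it's done once per density evaluation rather than redundantly for the determinant and the solve.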
There’s even more advantage within Stan, because the way we parameterize a dense covariance matrix under the hood is via its Cholesky factor (N choose 2 unconstrained elements below the diagonal, plus N diagonal elements that must be positive, so they’re log transformed). So it saves a lot of work in just creating a well-formed covariance matrix.
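A hypothetical sketch of that unconstrained-to-constrained transform (the exact traversal order and Jacobian bookkeeping in Stan's internals differ): k(k-1)/2 free below-diagonal entries plus k log-scale diagonal entries, exponentiated so the diagonal stays positive, yield a lower-triangular L whose product L Lᵀ is a valid covariance matrix by construction:

```python
import numpy as np

def cholesky_from_unconstrained(theta, k):
    """Map an unconstrained vector of length k*(k+1)/2 to a lower-triangular
    Cholesky factor with strictly positive diagonal."""
    L = np.zeros((k, k))
    n_off = k * (k - 1) // 2
    L[np.tril_indices(k, -1)] = theta[:n_off]       # k-choose-2 free entries
    L[np.diag_indices(k)] = np.exp(theta[n_off:])   # log-transformed diagonal
    return L

# k = 3: three off-diagonal values followed by three log-diagonal values
theta = np.array([0.3, -1.2, 0.5, 0.0, 0.1, -0.4])
L = cholesky_from_unconstrained(theta, 3)
Sigma = L @ L.T  # symmetric positive definite, no rejection or check needed
print(np.allclose(Sigma, Sigma.T), np.all(np.linalg.eigvalsh(Sigma) > 0))
```

The point is that every unconstrained vector maps to a valid covariance matrix, so when the model uses `multi_normal_cholesky` directly, Stan never has to multiply the factor back out into a full covariance matrix at all.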
We should probably hint that it’s both more numerically stable and more efficient. We cover that fairly early in the User’s Guide, in the regression chapter.
OK, cool, thanks. Perhaps at least adding a link from the function documentation to that section would be nice. It’s not obvious to look this up in the regression chapter; one may easily use the multivariate normal distribution in contexts that wouldn’t be considered a regression model.
Definitely. Here’s the issue in our documentation repository: