Sum of Cholesky factors vs sum of covariance

@RJTK, that is some sweet math, but as you point out there is no likelihood readily available.

@spinkney, I apologize for being kind of vague with my questions. In general, I am curious about extending the simple model from this thread to more complicated cases, such as modeling differences in covariance related to a continuous variable (instead of a binary group variable, as in this case).

The Stan user guide introduces the Cholesky factor parameterization as a speed/efficiency consideration, since multi_normal and lkj_corr need to factor the variance-covariance (or correlation) matrix internally. There is also mention of numerical stability, without much justification given. There is some discussion of this on this forum.
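To make the efficiency argument concrete, here is a minimal sketch in plain Python (not Stan, so it runs standalone) of a multivariate normal log density evaluated two ways. The function names `cholesky`, `mvn_lpdf_cholesky`, and `mvn_lpdf_cov` and the numbers are made up for illustration, and this is not how Stan implements either density. It only shows that the covariance version has to do a factorization that the Cholesky version skips.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def mvn_lpdf_cholesky(y, mu, L):
    """Log density given the Cholesky factor L of the covariance matrix."""
    n = len(y)
    # Solve L z = (y - mu) by forward substitution; no factorization is needed.
    z = []
    for i in range(n):
        s = sum(L[i][k] * z[k] for k in range(i))
        z.append((y[i] - mu[i] - s) / L[i][i])
    # log det(Sigma) = 2 * sum(log(diag(L)))
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))

def mvn_lpdf_cov(y, mu, sigma):
    """Same log density given the covariance matrix: it must be factored first."""
    return mvn_lpdf_cholesky(y, mu, cholesky(sigma))

# Illustrative values only.
sigma = [[4.0, 2.0], [2.0, 3.0]]
y, mu = [1.0, 2.0], [0.0, 0.0]
L = cholesky(sigma)  # with a Cholesky parameterization, L would be the parameter itself
print(mvn_lpdf_cholesky(y, mu, L), mvn_lpdf_cov(y, mu, sigma))
```

Both calls print the same value; the only difference is where the factorization happens.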

However, in my case it looks like I need to factor the variance-covariance matrix at least once per iteration anyway, plus call tcrossprod. If I skip the re-parameterization and just use the variance-covariance matrix directly, I will end up factoring the matrix at least twice per iteration (once for the prior and once for the likelihood), but I will skip the calls to tcrossprod, avoid the generated quantities section, and make my model somewhat more readable.
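As a small illustration of why the factoring step seems unavoidable, here is a sketch in plain Python (again not Stan). It assumes, going by the thread title, that one group's covariance is built as a sum of two covariance matrices; the helper names and the numbers are hypothetical. The `tcrossprod` helper just computes L L^T.

```python
def tcrossprod(L):
    """Return L L^T for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in L] for row_i in L]

def mat_add(a, b):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Two made-up lower-triangular Cholesky factors.
L_a = [[1.0, 0.0], [0.5, 1.0]]
L_b = [[2.0, 0.0], [1.0, 0.5]]

# Covariance obtained by summing the two covariance matrices.
sum_of_covariances = mat_add(tcrossprod(L_a), tcrossprod(L_b))

# Covariance implied by summing the factors first.
covariance_of_summed_factors = tcrossprod(mat_add(L_a, L_b))

print(sum_of_covariances)
print(covariance_of_summed_factors)
```

The two results differ, so the Cholesky factor of a summed covariance is not the sum of the factors. Under that assumption, getting a factor for the summed matrix means building it with tcrossprod and factoring it again.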

I guess I’m kind of arguing with myself here, but I am wondering whether there are any expert opinions about the merits of using the Cholesky factor parameterization for this sort of model, where it is impossible to avoid factoring altogether.