What’s happening in the denominator here:
cholesky_cov_mu_b = cholesky_decompose(cov_mat_mu_b)/sqrt(rep_row_vector(1.0, S) * cov_mat_mu_b * rep_vector(1.0, S));
If this were just the Cholesky factor of the covariance, I would expect it to be cholesky_decompose(cov_mat_mu_b) on its own, without the division.
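For reference, here is my reading of what that denominator computes, written out as math. This is a sketch of the algebra only, not a claim about why the model was written this way. The symbols $\Sigma$ (for cov_mat_mu_b), $L$ (its Cholesky factor), $\mathbf{1}$ (the length-S vector of ones), and $c$ are my own notation, and I'm assuming $\Sigma$ is positive definite so that $c > 0$.

$$
c = \mathbf{1}^\top \Sigma \, \mathbf{1} = \sum_{i=1}^{S} \sum_{j=1}^{S} \Sigma_{ij},
\qquad
\left(\frac{L}{\sqrt{c}}\right)\left(\frac{L}{\sqrt{c}}\right)^{\top} = \frac{L L^\top}{c} = \frac{\Sigma}{c}
$$

So dividing by $\sqrt{c}$ gives the Cholesky factor of $\Sigma / c$, a covariance matrix rescaled so that all of its entries sum to 1. Equivalently, if $x$ has covariance $\Sigma / c$, then $\operatorname{Var}\left(\sum_i x_i\right) = 1$.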
I guess we would expect this model to go fast if the data wasn’t there. I wonder if your data is very informative or something?
For example, the centered parameterization tends to sample better than the non-centered one when the data are informative (there's an 8-schools example here: QR decomposition of parameters for multilevel regression? - #5 by bbbales2)
Also hellooooo, long time no talk. Stop by the Slack and say hi sometime: Mc-stan community slack