# Multi_normal_cholesky dramatically slower when cholesky factor scaled with variances

I’m working with a model that has multiple pairs of correlated observations, and so am estimating multiple `multi_normal_cholesky` distributions. I’m also estimating the variances of these observations, and so am using the `diag_pre_multiply(variance, cholesky)` approach.
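For reference, `diag_pre_multiply(sds, L_corr)` gives the Cholesky factor of the implied covariance matrix, since `diag(s) * L * L' * diag(s) = diag(s) * R * diag(s)`. A quick numpy sketch with made-up numbers (not from the attached data) to illustrate:

```python
import numpy as np

# Hypothetical 2x2 example: standard deviations and a correlation matrix
sigma = np.array([1.5, 0.7])
rho = 0.4
R = np.array([[1.0, rho], [rho, 1.0]])
L_corr = np.linalg.cholesky(R)  # Cholesky factor of the correlation matrix

# Stan's diag_pre_multiply(sigma, L_corr) is diag(sigma) @ L_corr
L_cov = np.diag(sigma) @ L_corr

# L_cov is the Cholesky factor of the full covariance matrix
Sigma = np.diag(sigma) @ R @ np.diag(sigma)
print(np.allclose(L_cov @ L_cov.T, Sigma))  # True
```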

When using just the Cholesky factor (not scaling by the variances), the chains mix well enough and the model runs in a reasonable time (~30 seconds).

However, when using the scaled Cholesky factors, the model runs 10x slower (~350 seconds) and the chains are almost completely stationary.

I’ll paste the full model below, as well as attach some data that reproduces the issue. Any insight into where this is coming from would be greatly appreciated.

Attachments: data.R (13.1 KB), NonCentRes.stan (2.8 KB)

The relevant part is this section:

```stan
for(c in 1:(T*(J-1))) {
  matrix[2,2] y_cov[J*(T-1)];
  y_cov[c] = diag_pre_multiply(resvar[{c,J+c}], y_Lcorr[c]);
  y[,{c,J+c}] ~ multi_normal_cholesky(eta_vec[,{c,J+c}], y_cov[c]);
}
```

The model works without any issues if I use `y_Lcorr[c]` instead of `y_cov[c]` in the `multi_normal_cholesky` call.

The full model is:

```stan
data{
  int N;              // Number of individuals
  int J;              // Number of items
  int T;              // Number of timepoints
  vector[J*T] y[N];
}

parameters{
  row_vector[J*T] intercepts;

  /* 'Raw' Variables for Non-Centered Parameterisation */
  matrix[N,T] eta_r;

  /* Correlations between factors over time,
     item residual variances, and residual correlations */
  cholesky_factor_corr[T] e_Lcorr;
  vector<lower=0>[J*T] resvar;
  cholesky_factor_corr[2] y_Lcorr[J*(T-1)];
}

transformed parameters{

  for(j in 1:J)
}

model{
  /* These are temporary variables, used for holding
     functions of other parameters. More efficient on
     memory to include here as their posteriors aren't saved */
  matrix[J*T, T] lam;
  matrix[N,T] eta_t;
  matrix[N,J*T] lam_eta;
  row_vector[J*T] eta_vec[N];

  /* Priors for factor and residual correlations */
  e_Lcorr ~ lkj_corr_cholesky(1);

  for(c in 1:(T*(J-1)))
    y_Lcorr[c] ~ lkj_corr_cholesky(1);

  intercepts ~ normal(0,5);

  for(t in 2:(T-1))
    rep_vector(0, J*(T-t)));

  /* Priors for residual variances */
  resvar ~ std_normal();

  /* Priors for 'raw' factor scores and transform with multivariate
     non-centered parameterisation */
  to_vector(eta_r) ~ std_normal();
  for(n in 1:N)
    eta_t[n] = (e_Lcorr*eta_r[n]')';

  /* Create the E[Y] for each individual */
  lam_eta = eta_t*lam';
  for(n in 1:N)
    eta_vec[n] = lam_eta[n] + intercepts;

  /* Link to observed data, with a multivariate normal distribution
     for each pair of correlated observations */
  for(c in 1:(T*(J-1))) {
    matrix[2,2] y_cov[J*(T-1)];
    y_cov[c] = diag_pre_multiply(resvar[{c,J+c}], y_Lcorr[c]);
    y[,{c,J+c}] ~ multi_normal_cholesky(eta_vec[,{c,J+c}], y_cov[c]);
  }
}

generated quantities {
  /* Transform factor correlations from Cholesky to Correlation matrix */
  corr_matrix[T] corr = multiply_lower_tri_self_transpose(e_Lcorr);

  /* Put variance parameters into variance scale (from SD) */
  vector[J*T] vars = square(resvar);
}
```

I don’t fully understand your model, but there is some weird stuff going on, which might be me not understanding your model, or might be real problems.

Most notably, the `resvar` variable is weird in that some of its elements get reused across multiple `y_cov` matrices while others don’t (the last `J` elements only appear in one matrix). Is this intended? I can imagine that using each element twice induces difficult posterior geometry.
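To make the reuse concrete, here is a small Python sketch of the indexing in that loop, with hypothetical `J` and `T` (the real values come from the attached data.R). Each iteration touches `resvar[{c, J+c}]`, so the middle elements land in two covariance matrices while the first and last `J` elements land in only one:

```python
from collections import Counter

# Hypothetical dimensions, for illustration only
J, T = 3, 3

counts = Counter()
for c in range(1, T * (J - 1) + 1):  # mirrors for(c in 1:(T*(J-1)))
    counts.update([c, J + c])        # mirrors resvar[{c, J+c}]

print(dict(counts))
# Elements J+1 .. T*(J-1) appear in two covariance matrices;
# the first J and last J elements appear in only one.
```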

Also, why is `y_cov` an array of matrices? Wouldn’t the following work just as well?

```stan
for(c in 1:(T*(J-1))) {
  matrix[2,2] y_cov;
  y_cov = diag_pre_multiply(resvar[{c,J+c}], y_Lcorr[c]);
  y[,{c,J+c}] ~ multi_normal_cholesky(eta_vec[,{c,J+c}], y_cov);
}
```

Generally the strategy going forward would be to strip everything except for the covariances from the model and see if the problem keeps appearing there.


Oh good catch, I was getting too clever with my looping.

The model is a repeated measures factor analysis with autocorrelated residuals, so y1_t1 is correlated with y1_t2, y1_t2 with y1_t3, and so on. The aim was to specify pairwise multi-normal distributions, but I didn’t realise that some of the y’s and their variances were appearing in multiple `multi_normal` calls (e.g. y1_t2 gets a separate `multi_normal` with each of y1_t1 and y1_t3).

The working solution I found was to ‘pack’ the individual Cholesky factors into a single larger Cholesky factor and use that in a single `multi_normal_cholesky` call:

```stan
transformed parameters{
  cholesky_factor_corr[J*T] y_corr = diag_matrix(rep_vector(1, J*T));

  for(c in 1:(T*(J-1))) {
    y_corr[c+J, c]   = y_Lcorr[c][2,1];
    y_corr[c+J, c+J] = y_Lcorr[c][2,2];
  }
}

model{
  ...

  y ~ multi_normal_cholesky(eta_vec, diag_pre_multiply(resvar, y_corr));
}
```
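As a sanity check on this packing scheme, here is a small numpy sketch (with hypothetical sizes and random stand-ins for the `y_Lcorr[c]` entries) confirming that the banded matrix built this way is lower triangular and that `y_corr * y_corr'` has a unit diagonal, i.e. it is a valid Cholesky factor of a correlation matrix:

```python
import numpy as np

# Hypothetical sizes; the real J and T come from the data
J, T = 3, 3
n = J * T
rng = np.random.default_rng(0)

y_corr = np.eye(n)                 # mirrors diag_matrix(rep_vector(1, J*T))
for c in range(J * (T - 1)):       # 0-based version of the packing loop
    rho = rng.uniform(-0.9, 0.9)   # stand-in for y_Lcorr[c][2,1]
    y_corr[c + J, c] = rho
    y_corr[c + J, c + J] = np.sqrt(1 - rho**2)  # y_Lcorr[c][2,2]

R = y_corr @ y_corr.T
print(np.allclose(np.diag(R), 1.0))            # True: unit diagonal
print(np.allclose(y_corr, np.tril(y_corr)))    # True: lower triangular
```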

That model runs exceedingly well and recovers the parameters nicely.

Thanks for catching that looping oddity, it’s always the little things that trip me up!