# About computation speed difference between multi_normal_lpdf and lpdf matrix operation

Dear all,

I am fitting a model in which the data points are multivariate normal:

``````
x \sim MVN(0, C I_d)
``````

I am wondering, in terms of specifying the model in Stan code, whether writing out the log likelihood in matrix form is faster than running a for loop over all the data points.

I thought writing out the log likelihood as

``````
target += -0.5 * N * (log_determinant(C) + trace(inverse(C) * crossprod(x)) / N);
``````

would be faster due to vectorization, but in my experiments it takes even longer than running a for loop over all the data points:

``````
for (n in 1:N) {
  target += multi_normal_lpdf(x[n] | rep_vector(0, D), C);
}
``````
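For what it's worth, the two formulations do compute the same quantity: the matrix form equals the summed per-observation log densities up to the additive constant -(N*D/2)*log(2*pi), which Stan can drop because it doesn't depend on the parameters. A quick NumPy/SciPy check of that identity, outside Stan, purely to verify the algebra:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
N, D = 50, 3

# A random SPD covariance matrix C and data x with rows ~ MVN(0, C)
A = rng.standard_normal((D, D))
C = A @ A.T + D * np.eye(D)
x = rng.multivariate_normal(np.zeros(D), C, size=N)

# Matrix form: -N/2 * (log|C| + (1/N) * tr(C^{-1} x'x)), constant dropped
sign, logdet = np.linalg.slogdet(C)
matrix_form = -0.5 * N * (logdet + np.trace(np.linalg.solve(C, x.T @ x)) / N)

# Loop form: sum of per-row MVN log densities (includes the 2*pi constant)
loop_form = sum(multivariate_normal(np.zeros(D), C).logpdf(xi) for xi in x)

# The two agree once the dropped constant is restored
constant = -N * D / 2 * np.log(2 * np.pi)
assert np.isclose(matrix_form + constant, loop_form)
```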

Can anyone explain?

One difference in the setup I noticed is that when I define the input data x, in order to use crossprod, I declare it in the first case as

``````matrix[N,D] x; // data
``````

while in the second case I declare it as

``````vector[D] x[N]; //data
``````

Everything else is the same; could this difference be slowing things down?

You should try `multi_normal_cholesky`, for which Stan has analytic gradients.

To be more concrete, Stan’s `multi_normal_cholesky` distribution is both vectorised (so no looping required) and has analytical gradients (which means it samples more efficiently).
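Concretely, if L is the lower-triangular Cholesky factor of the covariance (Sigma = L L'), the per-observation density that the Cholesky parameterisation works with reduces to log p(x) = -sum_i log(L_ii) - 0.5 * ||L^{-1}(x - mu)||^2 - (D/2) log(2*pi), which needs only a cheap triangular solve instead of a full inverse and determinant. A NumPy sketch verifying that identity against SciPy:

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.linalg import solve_triangular

rng = np.random.default_rng(1)
D = 4

# Random SPD covariance and its lower-triangular Cholesky factor
A = rng.standard_normal((D, D))
cov = A @ A.T + D * np.eye(D)
L = np.linalg.cholesky(cov)  # cov = L @ L.T
mu = rng.standard_normal(D)
x = rng.standard_normal(D)

# Cholesky-based log density: triangular solve, no explicit inverse.
# Uses log|cov| = 2 * sum(log(diag(L))).
z = solve_triangular(L, x - mu, lower=True)
logp = -np.sum(np.log(np.diag(L))) - 0.5 * (z @ z) - D / 2 * np.log(2 * np.pi)

assert np.isclose(logp, multivariate_normal(mu, cov).logpdf(x))
```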

As a toy example, you would use it like the following:

``````
data {
  int<lower=1> N;
  int<lower=1> D;
  vector[D] x[N]; // data
}

transformed data {
  vector[D] x_mu = rep_vector(0, D);
}

parameters {
  // Cholesky factor of a correlation matrix; for a full covariance,
  // combine it with a scale vector via diag_pre_multiply(sigma, C)
  cholesky_factor_corr[D] C;
}

model {
  C ~ lkj_corr_cholesky(1);
  x ~ multi_normal_cholesky(x_mu, C);
}
``````