Computation speed difference between multi_normal_lpdf and a matrix-form lpdf

Dear all,

Suppose each row of the N x D data matrix x is drawn independently as x_n \sim MVN(0, C).

In terms of specifying the model in Stan code, will writing out the log likelihood in matrix form be faster than running a for loop over all the data points?

I thought writing out the log likelihood in matrix form (dropping the additive constant) as

target += -N/2 * (log_determinant(C) + 1/N * trace(inverse(C) * crossprod(x)));

would be faster due to vectorization, but in my experiments it actually takes longer than running a for loop over all the data points:

for (n in 1:N) {
  target += multi_normal_lpdf(x[n] | rep_vector(0, D), C);
}
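Outside Stan, it is easy to confirm that the two parameterizations compute the same log likelihood, so any speed difference is purely about implementation. A minimal numerical check in Python with NumPy/SciPy (variable names are my own):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
N, D = 50, 3

# Some positive-definite covariance C and data x (rows are observations).
A = rng.normal(size=(D, D))
C = A @ A.T + D * np.eye(D)
x = rng.multivariate_normal(np.zeros(D), C, size=N)

# Matrix form, with the normalizing constant the Stan snippet drops put back:
# -N*D/2*log(2*pi) - N/2*(log|C| + (1/N)*tr(C^{-1} x'x))
sign, logdet = np.linalg.slogdet(C)
matrix_form = (-N * D / 2 * np.log(2 * np.pi)
               - N / 2 * (logdet + np.trace(np.linalg.solve(C, x.T @ x)) / N))

# Loop form: sum of per-observation MVN log densities.
loop_form = sum(multivariate_normal.logpdf(x[n], mean=np.zeros(D), cov=C)
                for n in range(N))

print(np.isclose(matrix_form, loop_form))  # prints True
```

So both versions target the same posterior; the question is only which one Stan's autodiff handles more efficiently.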

Can anyone explain?

One difference in the setup I noticed: to use crossprod, in the first version I declare the input data x as

matrix[N,D] x; // data

instead of

vector[D] x[N]; //data

Everything else is the same. Does this declaration slow things down?

You should try multi_normal_cholesky, for which Stan has analytic gradients.

To be more concrete, Stan’s multi_normal_cholesky distribution is both vectorised (so no looping required) and has analytical gradients (which means it samples more efficiently).
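To see why a Cholesky parameterization is cheap: with C = L L', the log determinant is just 2 * sum(log(diag(L))) and the quadratic form reduces to a triangular solve, so no explicit inverse or determinant computation is needed. A sketch of the arithmetic in Python (this illustrates the idea, not Stan's internals; names are my own):

```python
import numpy as np
from scipy.linalg import solve_triangular
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
D = 4
A = rng.normal(size=(D, D))
C = A @ A.T + D * np.eye(D)          # positive-definite covariance
x = rng.multivariate_normal(np.zeros(D), C)

L = np.linalg.cholesky(C)            # C = L L'
z = solve_triangular(L, x, lower=True)   # forward solve: L z = x
logdet = 2 * np.sum(np.log(np.diag(L)))  # log|C| from the diagonal of L
# x' C^{-1} x = z' z, so the lpdf needs only the solve above.
lpdf = -0.5 * (D * np.log(2 * np.pi) + logdet + z @ z)

print(np.isclose(lpdf, multivariate_normal.logpdf(x, np.zeros(D), C)))  # prints True
```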

As a toy example, you would use it like the following:

data {
  int<lower=1> N;
  int<lower=1> D;
  vector[D] x[N]; // data
}

transformed data {
  vector[D] x_mu = rep_vector(0, D);
}

parameters {
  cholesky_factor_corr[D] C;
}

model {
  C ~ lkj_corr_cholesky(1);
  x ~ multi_normal_cholesky(x_mu, C);
}