Vectorized use of multi_normal_lpdf

Hi there,
I’m applying a high-dimensional multivariate Gaussian mixture model to some epidemiological data and need to improve the computational speed. Essentially all of the computation time is spent in the multi_normal_lpdf loop, which I would ideally like to vectorize. However, in my case I think that would only work if multi_normal_lpdf returned the vector of per-observation values rather than their sum, because I need to weight the individual values by another parameter. I’d really appreciate any suggestions for workarounds or other ways to speed this up.

Here is a much simplified two-dimensional example of the code. For context, I am using the model in up to 10 dimensions with thousands of data points, and I impose some rules on the relationships between the many Gaussian mean parameters to aid identifiability.

data {
 
 int N; // number of data points
 vector[2] y[N]; // observations
 vector[4] alpha; // Dirichlet prior on the mixture weights

}


parameters {
 
 simplex[4] theta; // gaussian weights
 vector[2] mu[4]; // gaussian means 
 cholesky_factor_corr[2] L[4]; // cholesky factor
 vector<lower=0>[2] sigma[4]; // standard deviations

}


transformed parameters {
  
  corr_matrix[2] R[4]; // correlation matrix
  matrix[2,2] Sigma[4]; // covariance matrix
  
  //--- generate covariance matrix ---//
  for(c in 1:4){
    R[c] = multiply_lower_tri_self_transpose(L[c]); // R = L * L'
    Sigma[c] = quad_form_diag(R[c], sigma[c]); // Sigma = diag(sigma) * R * diag(sigma)
  }
  
}


model {
 
  matrix[N,4] pC;

 // priors
 theta ~ dirichlet(alpha);
 mu[1] ~ normal(0, 0.2);
 mu[2] ~ normal(1.5, 0.2);
 for(c in 1:4) sigma[c] ~ normal(0,0.1);
 for(c in 1:4) L[c] ~ lkj_corr_cholesky(1);
 
 
 // likelihood
 for(n in 1:N){
   for(c in 1:4){
    pC[n,c] = log(theta[c]) + multi_normal_lpdf(y[n] | mu[c], Sigma[c]);
   }
  target += log_sum_exp(pC[n]);
  }

}

Welcome! I might be missing something, but it seems like you can solve this by swapping the order of the loops. That is, vectorize over N inside a loop over c. If it’s compatible with your identifying constraints, you might be able to get some extra juice out of using multi_normal_cholesky rather than forming the covariance matrix yourself in transformed parameters.
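
Something like this is what I had in mind for the Cholesky part (an untested sketch based on your posted code; L_Sigma is just a name I’m introducing for the Cholesky factor of each component’s covariance):

transformed parameters {
  cholesky_factor_cov[2] L_Sigma[4]; // Cholesky factor of each component's covariance
  for(c in 1:4)
    L_Sigma[c] = diag_pre_multiply(sigma[c], L[c]); // scale the correlation Cholesky by the sds
}

and then multi_normal_cholesky_lpdf(... | mu[c], L_Sigma[c]) wherever you currently have multi_normal_lpdf(... | mu[c], Sigma[c]), which avoids decomposing Sigma[c] inside the density on every evaluation.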

Thanks for your response! Unfortunately, with the current implementation of the multi_normal_lpdf function, vectorizing over N cannot give me the same final target value that I need due to the non-linearity of log_sum_exp.
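To write out what I mean: for each data point the contribution to the target has to be

log( theta[1]*MVN(y[n] | mu[1], Sigma[1]) + ... + theta[4]*MVN(y[n] | mu[4], Sigma[4]) ),

so I need the individual multi_normal_lpdf values for every n before the log_sum_exp. A multi_normal_lpdf call vectorized over N instead returns the sum of the log densities over all n for a single (mu, Sigma), so the per-observation terms are already added together before I get the chance to weight them by theta.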
I think I would need a vectorized version of the function that returns the vector of lpdf values for all N, rather than the sum of these. Maybe there is a nice workaround that I’m missing? Otherwise, I wonder if this alternative vectorized version of the multi_normal_lpdf is something that could potentially be implemented?
Regarding the Cholesky version for speed - I think this would leave my sigmas unidentifiable, but I will give this another look. Thanks!


Yeah, you’re right. Sorry about that. Swapping the loop order would yield a model in which every y is assumed to come from the same component, so the mixture would be applied in bulk to the whole dataset. I now see that you want each point to be drawn independently from the mixture, potentially arising from a different mu at each row.
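
If you do still want to vectorize over N, one workaround is to write the multivariate normal log density out by hand, one component at a time, so the per-observation terms stay available for the log_sum_exp. Here is a rough, untested sketch of what that could look like for your 2-dimensional example (it reuses your variable names; Y, L_Sigma, lp_const, z and quad are names I’m introducing, and the constant is -log(2*pi) per point because the dimension is 2):

transformed data {
  matrix[2,N] Y; // observations stored as columns so they can all be centered at once
  for(n in 1:N){
    Y[1,n] = y[n][1];
    Y[2,n] = y[n][2];
  }
}

model {

  matrix[N,4] pC;

  // priors as before ...

  // likelihood: one decomposition, one solve and one log-determinant per component
  for(c in 1:4){
    matrix[2,2] L_Sigma = diag_pre_multiply(sigma[c], L[c]); // Cholesky factor of Sigma[c]
    real lp_const = log(theta[c]) - log(2 * pi()) - sum(log(diagonal(L_Sigma))); // log weight + MVN normalising terms
    matrix[2,N] z = mdivide_left_tri_low(L_Sigma, Y - rep_matrix(mu[c], N)); // whitened residuals for all N points
    row_vector[N] quad = columns_dot_self(z); // squared Mahalanobis distances
    for(n in 1:N)
      pC[n,c] = lp_const - 0.5 * quad[n];
  }
  for(n in 1:N)
    target += log_sum_exp(pC[n]);

}

That way Sigma[c] is never rebuilt or re-factored inside the density N times per component, and the triangular solve and log determinant are done once per component rather than once per data point, which should remove most of the per-call overhead while keeping exactly the same target.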