Speeding up for loop with matrix - vector multiplication

stressfaktors · December 15, 2023, 5:59pm

Hi,
I apologize in advance for the very naive question. As part of my
Stan code I have a matrix A and an array of matrices B defined in the data
section as follows:

matrix[90000, 30] A;
array[30] matrix[90000, 12] B;

I also have a vector of 12 parameters called beta:

vector[12] beta;

I am currently performing the following multiplication in a for loop:

for (k in 1:30) {
  Y = A[, k] - B[k] * beta;
}

Is there a more efficient/fast way of performing this calculation?
Thank you for entertaining my very beginner question.

Corey.Plate · December 15, 2023, 6:35pm

If you have access to a cluster or a machine with many CPU cores, you can compile the model with threading enabled in CmdStan and parallelize the loop with reduce_sum:

make STAN_THREADS=TRUE my_folder/my_model

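functions {

 // reduce_sum needs a partial-sum function with the signature
 // (slice, start, end, shared arguments...) that returns a real,
 // so it returns the slice's log-density contribution, not the matrix C.
 real my_partial_sum(array[] matrix slice_n_B, int start, int end,
                     matrix A, vector beta, real sigma) {

  real lp = 0;

  for (k in start:end) {
   // normal(0, sigma) on the residuals is only a placeholder likelihood;
   // sigma is a hypothetical scale parameter, not from the original post.
   lp += normal_lpdf(A[, k] - slice_n_B[k - start + 1] * beta | 0, sigma);
  }

  return lp;

 }

}

model {

 // grainsize = 1 leaves the slice sizes to the scheduler
 target += reduce_sum(my_partial_sum, B, 1, A, beta, sigma);

}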

If you don’t have access to one, you can at least pull A out of the loop:

model {

 // C has to be declared before its columns are filled
 matrix[90000, 30] C;

 for (k in 1:30) {
  C[,k] = B[k]*beta;
 }

 Y = A - C;

}

I think the parallelized code should work, but it is not tested/debugged.
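
Going one step beyond pulling A out of the loop, here is a sketch that is also untested: since B is data, the 30 matrices could be stacked once in transformed data, so the 30 small products become a single matrix-vector product that is reshaped afterwards. The names N, K, P and B_stacked are introduced only for this sketch (90000, 30 and 12 in the original post), and whether it actually runs faster would need to be checked with profiling.

```stan
data {
  int<lower=1> N;                // rows (90000 in the post)
  int<lower=1> K;                // number of B matrices (30 in the post)
  int<lower=1> P;                // length of beta (12 in the post)
  matrix[N, K] A;
  array[K] matrix[N, P] B;
}
transformed data {
  // Stack B[1], ..., B[K] on top of each other; this runs once, not per iteration.
  matrix[N * K, P] B_stacked;
  for (k in 1:K) {
    B_stacked[((k - 1) * N + 1):(k * N), :] = B[k];
  }
}
parameters {
  vector[P] beta;
}
model {
  // to_matrix fills column by column, so column k of C equals B[k] * beta.
  matrix[N, K] C = to_matrix(B_stacked * beta, N, K);
  matrix[N, K] Y = A - C;
  // The likelihood that uses Y goes here; it is not shown in the thread.
}
```

The idea is that one large product gives the autodiff stack a single operation to record instead of 30, at the cost of holding a second copy of B in memory.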

stressfaktors · December 15, 2023, 6:41pm

Thank you so much for the suggestions Corey.Plate. I actually do have access to a cluster and I am just starting to try and figure out how to implement reduce_sum, so I will definitely try your suggestion and report back. However, before turning to reduce_sum I wanted to make sure that there was nothing else I could do to make my code as efficient as possible to begin with (e.g., by doing things like pulling A out of the loop, as you also suggested). My profiling shows that this for loop is a clear bottleneck in my code so I figured I would ask. Thanks again for responding!
