Matrix product

Given a row_vector of size M and a matrix of size M×N, I'm computing the product as row_vector * matrix.

I'm wondering whether it would be faster or slower to instead define a MATRIX of size N×M and a vector of size M, and compute the product as MATRIX * vector.

In my problem, M is small (e.g., M = 10) but N may be large (e.g., N = 10000). I'd very much appreciate advice on which way of doing the matrix product will be faster.

This is far from my expertise, but since there are no replies so far I’ll give it a try; if I get it wrong enough maybe I’ll get yelled at and you’ll get a better answer.

I believe the efficiency of this in Stan will follow that of the C++ libraries it uses: Eigen (and maybe also Boost). I'm not sure whether the row- or column-major storage of matrices will make a difference to the speed of this operation in practice (especially since there are usually more costly operations at every iteration).

In practice, instead of wondering, you could just try both options and run some trial runs to see if you notice any difference. I'm guessing it won't really make that much of a difference.
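Not Stan itself, but a quick NumPy sketch (with the sizes from the question) of the two orderings; the two products are transposes of each other, so before timing anything you can first confirm they agree. Wrapping each line in `timeit` would then give the benchmark:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 10, 10_000                   # small M, large N, as in the question

v = rng.standard_normal(M)          # the length-M (row) vector
A = rng.standard_normal((M, N))     # M x N matrix
B = A.T.copy()                      # N x M matrix for the other ordering

# row_vector * matrix  ->  length-N result
r1 = v @ A
# MATRIX * vector      ->  length-N result
r2 = B @ v

# Same numbers either way; only the memory-access pattern differs.
assert np.allclose(r1, r2)
```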


Eigen Matrices are column major in Stan.


so row_vector * matrix would be faster?

For row_vector * matrix, the row_vector is multiplied by each column of the matrix. Since Eigen matrices are column major, I'd assume that's more performant. On the flip side, matrix * vector multiplies each row of the matrix by the vector, so it's non-contiguous access. But Eigen might have tricks for this, so you may not end up with a difference either way. It's always better just to benchmark these things.
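A rough picture of why the layout matters, sketched in NumPy by forcing Fortran (column-major) order, which is the storage Eigen uses in Stan. Each column is then contiguous in memory, so row_vector * matrix is N dot products against contiguous columns:

```python
import numpy as np

M, N = 10, 5
A = np.asfortranarray(np.arange(M * N, dtype=float).reshape(M, N))

# In column-major (Fortran/Eigen) storage, stepping down a column moves by
# one element (8 bytes for float64); stepping across a row jumps M elements.
assert A.strides == (8, 8 * M)

# row_vector * matrix = N dot products, each against one contiguous column.
v = np.ones(M)
out = np.array([v @ A[:, j] for j in range(N)])
assert np.allclose(out, v @ A)
```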


The adjoint of c = Ab is \bar{b} = A^T \bar{c}, so the gradient pass multiplies by the transpose and is efficient if you right-multiply; whichever orientation is contiguous in the forward pass is strided in the reverse pass. So who even knows
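A numerical sketch of that adjoint rule: for c = Ab and a downstream gradient \bar{c}, the reverse-mode gradient with respect to b is A^T \bar{c}, checked here against finite differences (sizes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
b = rng.standard_normal(3)
c_bar = rng.standard_normal(4)      # adjoint flowing into c = A b

# reverse-mode rule: b_bar = A^T c_bar  (a multiply by the transpose)
b_bar = A.T @ c_bar

# finite-difference check of d(c_bar . A b) / db, one basis vector at a time
eps = 1e-6
fd = np.array([
    (c_bar @ (A @ (b + eps * e)) - c_bar @ (A @ (b - eps * e))) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(b_bar, fd, atol=1e-6)
```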


Thank you! Do you mean the code below, with right-multiply, will be faster?

data {
  int<lower=1> K;
  int<lower=1> N;
  array[N] vector[K] x;
  array[N] real y;
}
parameters {
  vector[K] beta;
}
model {
  for (n in 1:N)
    y[n] ~ normal(dot_product(x[n], beta), 1);
}

or this way?

data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] x;
  vector[N] y;
}
parameters {
  vector[K] beta;
}
model {
  y ~ normal(x * beta, 1);
}

I'm not sure right-multiply is good when N >> K, for example N = 1 million, K = 3.
I'll test both types of matrix multiplication to see the performance.
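For what it's worth, the two models compute the same linear predictor; a NumPy sketch with made-up sizes shows the loop of dot products and the single matrix-vector product agree:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 1000, 3                      # N >> K, as in the question
x = rng.standard_normal((N, K))
beta = rng.standard_normal(K)

# loop-of-dot-products form (first model): one dot product per observation
mu_loop = np.array([x[n] @ beta for n in range(N)])
# single matrix-vector product (second model)
mu_vec = x @ beta

assert np.allclose(mu_loop, mu_vec)
```

In Stan the vectorized `y ~ normal(x * beta, 1)` form is generally recommended regardless of layout, since it builds one matrix-multiply node in the autodiff graph instead of N separate dot-product nodes.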