Vectorizing sampling of arrays of vectors

I have an array of vectors as parameters, and I’m wondering how to vectorize setting the priors of the parameters. Below is example code which illustrates the problem I’m facing.

data {
  int<lower=0> N;
  vector[3] y[N];
  cov_matrix[3] SD[N];

  vector[3] example_mu[N];

  vector[3] filler;
  vector[3] m[N];
  vector[3] s[N];
  filler[1] = 1;
  filler[2] = 2;
  filler[3] = 3;

  m = rep_array(filler,N);
  filler[1] = 2;
  filler[2] = 3;
  filler[3] = 4;
  s = rep_array(filler,N);

  // this isn't allowed but I want to vectorize my vector operations 
  example_mu ~ normal(m,s);
  // otherwise I'd have to loop through like this:
  for(i in 1:N){
    example_mu[i] ~ normal(m[i], s[i]);

  // what other options do I have?? 
  y ~ multi_normal(example,SD);


I don’t think there is a lot you can do differently because you need an array for the multi_normal function. Your approach is vectorizing each vector in the array. I think the most important benefit from vectorizing comes from reusing values. So even if example_mu ~ normal(m, s); you would not get the full benefits of vectorizing because m[i] <> m[j].

I think a big part of the problem is that the for loops in sampling add a lot of time. I think if I was able to make all the sampling into one vectorized operation then it only takes the gradient once of that vector vs. in the for loop it is taking N gradients which is a lot more computationally expensive. I tried out my actual model out with the flor loop and it took excessively long. I think that it won’t be possible to actually use that model unless I find a way to get rid of the for loop.

If you declare the parameters as a matrix like

parameters {
  matrix[N, 3] example_mu;

then you can use the to_vector function in the model block like

model {
    to_vector(example_mu) ~ normal(...);

but the arguments to normal have to either be scalars or vectors that are N \times 3.

Ok that makes sense, but can you have the standard deviation argument be a N x 3 matrix and will stan know that (for example) the (4,2) nd entry of that is the covariance between example_mu[4] and example_mu[5] ?

No, if you are using normal (as opposed to one of the multi_normal variants), it assumes the covariances are zero.

1 Like

Probably good to clarify. The original model has two log likelihood statements

to_vector(example_mu) ~ normal(m, s);
y ~ multi_normal(example_mu, SD);

The first line has no correlations, the second line has the corrrelations implied by SD.

Unfortunately, this is not going to work because example_mu is a matrix and multi_normal needs an array of vectors. I think you will need to make a new example_mu_array and copy the vectors from the matrix in the array in a for loop.