Vectorization and vector unstacking

Dear all, I have a question regarding vectorization: I am fitting 3-dimensional data to the Poisson model and want to avoid loops. I am partially able to do it in the model block, but not to compute the linear predictor. Hence, three questions:

1). In the model block, where the linear predictor is expressed as a linear combination of covariates, which are all passed into the Stan program in a 4-dimensional array, is there a way to avoid indexing all the summands of the linear predictor manually, such as beta[173]*to_vector(to_array_1d(c[,173]))? (Side note: I am not willing to compute linear predictor as a separate quantity in transformed parameters block due to the large size of data that I am dealing with. It is large enough to make the loo package crash the R session after the draws are made).

2). In the generated quantities block, is there a way to avoid loops? As far as I understand, there is no vectorization available for poisson_log_lpmf. Please correct me if I am wrong.

3). Is this right, that there is currently no function, unstacking a vector into an array of three or more given dimentions?

Please find the relevant lines of the program below and thank you in advance!

data{
int<lower=1> n1;
int<lower=1> n2;
int<lower=1> n3;
int y[n1, n2, n3];
int<lower=1> p;
real c[n1, n2, n3, p];
}

parameters {
vector[p] beta;
}

model {
// likelihood
to_array_1d(y) ~ poisson_log(beta[1]*to_vector(to_array_1d(c[,1])) + beta[2]*to_vector(to_array_1d(c[,2])) +… )
}

generated quantities {
real log_lik[n1,n2,n3];
for (i3 in 1:n3)
for (i1 in 1:n1)
for (i2 in 1:n2)
log_lik[i1,i2,i3] = poisson_log_lpmf(y[i1,i2,i3]|to_row_vector(c[i1,i2,i3,])*beta);
}

Does this make sense for the first thing:

transformed data {
  matrix[p, n1 * n2 * n3] X;

  for(i in 1:p) {
    X[i] = to_row_vector(to_array_1d(c[:, :, :, i]));
  }
}

parameters {
  row_vector[p] beta;
}

model {
  to_array_1d(y) ~ poisson_log(beta * X);
}

Or is this the transformation you wanted to avoid? (Beware the indexing is different that your example! I don’t think to_array_1d(y) ~ poisson_log(beta[1]*to_vector(to_array_1d(c[,1])) + beta[2]*to_vector(to_array_1d(c[,2])) +… ) is quite valid – need more commas in the indexing into c).

  1. The _lpmf/_lpdf functions only ever output one value, so there’s no vectorization like that.

  2. I don’t think there is. I guess you’d have to roll your own. Make sure and get the row-major vs. column major stuff right (to_array_1d(real[…]) goes to row major, but to_array_1d(matrix) goes to column major).

Hope that helps!

Rather than all this reshaping in the model block, is there a way to declare the input data to use things more easily? If you really need to do this reshaping, it should be done in the transformed data block, which is only executed once.

If p == 2 as you seem to be assuming, you can just code it that way. Otherwise, if p > 2, the beta[3:p] are unconstrained.

You can vectorize poisson_log_lpmf, but it returns a sum. Loops aren’t slow in Stan, so don’t worry about them in generated quantities. The reason to avoid loops elsewhere in Stan is that we can share computations sometimes in calculating derivatives.

Right. But packing/unpacking won’t be slow at all in Stan. It all gets compiled down to C++. The things to avoid are unvectorized calls to arithmetic or probability functions in the model and transformed parameters blocks—those involve derivatives and can be sped up with vectorization.