Slice an array of vectors across the first dimension (as row() for a matrix)

Hello Stanimals,

Given an array of vectors, how can one efficiently extract the n’th element from each vector in that array. The application is a hierarchical model a set of two correlated random effects that determine the group-level mean (random effect #1) and within-group variance (random effect #2). These correlated group-level random effects are modelled as a function of a multivariate normally-distributed parameter rfx_district that is implemented in my model code as an array of vectors:

parameters {
  vector[2] rfx_district[N_group] ;       // Group-level random effects
  cholesky_factor_corr[2] L_omega;        // Correlation matrix for rfx (Cholesky decomposition)
  vector<lower=0>[2] sigma_rfx_district;  // Scales of rfx
}
model {
  rfx_district ~ multi_normal_cholesky(mu_zero, diag_pre_multiply(sigma_rfx_district, L_omega));  //   mu_zero is specified in transformed data as a vector of zeroes
}

To efficiently construct the group-level mean and variance parameters, I would somehow like to index or slice the array of vectors such that I get a vector or an array or reals of length equal to the length of the original array, with every element containing the n’th value of each vector in the original array of vectors. In other words, what is the most efficient way to slice an array of vectors across its first dimension as if it were a matrix (i.e., as one might apply ‘row’ to a matrix to extract a particular row)? Or can this only be achieved by iterating over each element of the array rfx_district?

Apologies if this has been asked before. I could not find pointers anywhere else on discourse.

I’m using rstan 2.21.2 in R version 4.0.3 on MacOS Catalina 10.15.7.

EDIT: typo corrected

If all you’re doing is extracting, using a loop is fine. So a simple Stan function to do this would be:

vector extract_each(vector[] x, int n) {
  int M = size(x);
  vector[M] y;
  for (m in 1:M) y[m] = x[m, n];
  return y;
}

That’s as efficiently as it can be done. A built-in wouldn’t be any faster. The inefficiency here is the memory non-locality (each array is independently allocated, so they’re not contiguous), not the loop. The problem is usually that you want to access both rows and columns efficiently, and that’s not really possible without paying the transpose cost.

1 Like