to_array_1d row-major and to_vector column-major ordering -- why?


Hi everyone,

I recently implemented a simple multi-class logistic regression in Stan, and wrote the following:

data {
  int n; // number of sites
  int k; // number of species
  int p; // number of predictors

  int y[n, k]; // number of observations for each species
  matrix[n, p] X; // design matrix of predictors
}
transformed data {
  // to_array_1d flattens an array in row-major order
  int flattened_y[n * k] = to_array_1d(y);
}
parameters {
  matrix[p, k] beta_1; // species response to predictors
}
model {
  matrix[n, k] logit_response;
  logit_response = X * beta_1;

  to_vector(beta_1) ~ normal(0, 1);

  // Big issue here! to_vector does column major, while to_array_1d does row
  // major. Hence we need the transpose here...
  flattened_y ~ bernoulli_logit(to_vector(logit_response'));
}

I had no end of trouble because I hadn’t realised that to_array_1d and to_vector flatten in different orders: to_vector goes through the matrix in column-major order, while to_array_1d goes through the array in row-major order, so the observations and the linear predictor ended up misaligned. I managed to fix it with the transpose in the code above, but it took a while to track down! I was wondering whether there is a good reason for this.



Each function flattens in the order its input is stored in memory, which avoids reordering the elements. Unfortunately the storage orders of matrices and arrays don’t match.
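
To make the mismatch concrete, here is a minimal sketch on a toy 2 x 3 example (the names a and m and the values are made up for illustration, they are not from the model above). It uses the same older array declaration style as the original post; recent Stan releases would write array[2, 3] int a instead.

```stan
transformed data {
  // The same six values, once as a 2D integer array and once as a matrix.
  int a[2, 3] = {{1, 2, 3}, {4, 5, 6}};
  matrix[2, 3] m = [[1, 2, 3], [4, 5, 6]];

  // Array input: row-major, elements come out in the order 1, 2, 3, 4, 5, 6.
  print(to_array_1d(a));

  // Matrix input: column-major, elements come out in the order 1, 4, 2, 5, 3, 6.
  print(to_vector(m));

  // Transposing first makes the matrix flatten in the order 1, 2, 3, 4, 5, 6,
  // which lines up with the flattened array.
  print(to_vector(m'));
}
```

Note that the difference comes from the input type rather than the function name: as far as I know, to_array_1d applied to a matrix (rather than an array) also flattens in column-major order.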


Perhaps not a good one. We went with the default column-major ordering of the Eigen C++ library's matrices, which Stan uses. Arrays have to be row-major because they’re implemented as arrays of arrays (using std::vector under the hood).
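
If the flattening is only there to vectorise the likelihood, one way to sidestep the ordering question is to not flatten at all and loop over species instead. The following is a rough sketch of that idea, not a tested drop-in replacement: it assumes y holds 0/1 values (as bernoulli_logit requires), the bounds on the declarations are my own additions, and it keeps the older array declaration style used in the question.

```stan
data {
  int<lower=0> n;                // number of sites
  int<lower=0> k;                // number of species
  int<lower=0> p;                // number of predictors
  int<lower=0, upper=1> y[n, k]; // assumed 0/1 outcome per site and species
  matrix[n, p] X;                // design matrix of predictors
}
parameters {
  matrix[p, k] beta_1;           // species response to predictors
}
model {
  matrix[n, k] logit_response = X * beta_1;

  to_vector(beta_1) ~ normal(0, 1);

  // One vectorised statement per species: y[:, j] and column j of
  // logit_response both run over sites 1..n, so no flattening order matters.
  for (j in 1:k) {
    y[:, j] ~ bernoulli_logit(col(logit_response, j));
  }
}
```

This trades a single flattened statement for k smaller ones, so the version with the transpose may still be preferable when k is large.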