Stan function for converting a (wide) array to a (long) matrix


#1

I am curious whether anyone has written a Stan function that converts an array of arbitrary dimensions to a “long” matrix whose first D columns give the D indices of the array and whose (D+1)th column gives the value in each cell of the array. Figure 16.1 on p. 230 of the Stan manual (version 2.17.0) illustrates the conversion I am thinking of (though it also drops missing elements, which is not essential for my purposes). Obviously this conversion is easy to do in R with a function like reshape2::melt, but ideally I would be able to do it within Stan as well. My sense from reading the Stan forums is that there has been some demand for such a function in the past, but I wasn’t able to locate any workable solutions. Any help would be appreciated. Thanks!

Devin


#2

Not that I know of. We tend to do it on the outside as part of data prep.

There’s no way to write that kind of polymorphic function in the Stan user-defined function syntax. It could be done in C++ and either hacked into the language as a specialized expression or defined for a finite set of instantiations.

What you probably want as output is not quite a long matrix, but an integer array of the indices and a real array or vector of the values. And there’s no way to return multiple types like this until we have tuples.
What you’d need, say for a 3D array, is:

data {
  ...
  real x[I, J, K];
}
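transformed data {
  int N = I * J * K;   // total number of cells in x
  int x_idx[N, 3];     // row n holds the (i, j, k) indices of cell n
  vector[N] x_val;     // x_val[n] holds the value of cell n
  {
    int pos = 1;       // next row to fill
    for (i in 1:I) {
      for (j in 1:J) {
        for (k in 1:K) {
          x_idx[pos] = { i, j, k };
          x_val[pos] = x[i, j, k];
          pos += 1;
        }
      }
    }
  }
}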

For neatness, I made pos a local variable. That’s not necessary for efficiency.


#3

Thanks, Bob. As always, I appreciate your swift and helpful reply. I had something like this in mind but couldn’t quite figure out how to implement it. I’ll work on trying to generalize it and email back if I come up with something useful. Thanks again!

Devin


#4

I really don’t think there’s an easy way to generalize given the way Stan works. The problem is setting x_val and incrementing pos inside a function, since Stan functions cannot modify their arguments.

It would be easy enough to write a general C++ template metaprogram to do this; it’s what a lot of our I/O and vectorized operations do under the hood.
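
To give a rough idea of what such a template could look like, here is a minimal sketch. It assumes the Stan array arrives in C++ as nested std::vector<double>, which is my understanding of how the generated code represents arrays. The names Cell, melt, and melt_array are made up for illustration and are not part of Stan's API. The recursion peels off one dimension per call, so the same two overloads handle any number of dimensions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of Stan): one "long" row,
// holding the 1-based indices of a cell together with its value.
struct Cell {
  std::vector<int> idx;
  double val;
};

// Base case: a scalar is a single cell located at the accumulated indices.
inline void melt(double x, std::vector<int>& prefix, std::vector<Cell>& out) {
  out.push_back(Cell{prefix, x});
}

// Recursive case: peel off one array dimension and recurse into each element.
template <typename T>
void melt(const std::vector<T>& x, std::vector<int>& prefix,
          std::vector<Cell>& out) {
  for (std::size_t i = 0; i < x.size(); ++i) {
    prefix.push_back(static_cast<int>(i) + 1);  // Stan indices start at 1
    melt(x[i], prefix, out);
    prefix.pop_back();
  }
}

// Convenience wrapper: returns the cells in the same order as the nested
// loops in the Stan program above (last index varies fastest).
template <typename T>
std::vector<Cell> melt_array(const std::vector<T>& x) {
  std::vector<int> prefix;
  std::vector<Cell> out;
  melt(x, prefix, out);
  return out;
}
```

Calling melt_array on a std::vector<std::vector<std::vector<double>>> gives one Cell per element, with idx of length 3. Wiring something like this into Stan itself would still need the language-level work described in #2.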