How to pass an array of integer arrays of different lengths into Stan?

I’m facing the problem described in this post:

https://groups.google.com/forum/#!topic/stan-users/oX6Uq100_Y0

The most appropriate solution (given that I can’t order by group) seems to be the one proposed by @jonah: pass an array of integer arrays into Stan containing the indexes for each group.

I’m struggling to work out how to pass this array of arrays into Stan when the integer array for each group has a different length, since I believe I have to declare consistent array dimensions in the data block. Perhaps this is not yet possible in Stan?

This came up (among other places) at


Basically the same considerations apply to integer arrays.
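
For reference, one alternative to padding that is often used for ragged data in Stan is to concatenate all groups' indexes into one flat integer array and pass the group sizes alongside it. Below is a minimal sketch, assuming every group has at least one index and that the indexes point into the 20-element `Vals` vector used in the examples in this thread. The names `K`, `GroupSize`, `IndFlat` and `pos` are hypothetical, and the code uses the current `array[...] int` declaration syntax (older Stan releases wrote this as `int IndFlat[K]`).

```stan
data {
  int<lower=0> N;                           // number of groups
  int<lower=0> K;                           // total number of indexes across all groups
  array[N] int<lower=1> GroupSize;          // number of indexes in each group (assumed to sum to K)
  array[K] int<lower=1, upper=20> IndFlat;  // all groups' indexes, concatenated in group order
  vector[20] Vals;                          // vector to subset
}
model {
  int pos = 1;  // start of the current group's block inside IndFlat
  for (i in 1:N) {
    // take the GroupSize[i] indexes belonging to group i and use them to subset Vals
    vector[GroupSize[i]] SubsetVals = Vals[segment(IndFlat, pos, GroupSize[i])];
    pos += GroupSize[i];  // advance to the next group's block
    // ... use SubsetVals here
  }
}
```

On the calling side, `IndFlat` would be built by concatenating the per-group index arrays in order, with `GroupSize[i]` set to the length of group `i`.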

Thanks @bgoodri. Is it OK to put inf values in integer arrays (as the padding option)?

As a simplified example, would this method work for stripping the inf values out of each integer array before using the remaining integers to index a vector?

data {
   int N; //num observations
   int Ind[N, 5]; //Array of N integer arrays of length 5 padded with Inf (at the end when total non-inf values < 5)
   vector[20] Vals; //vector to subset 
}
model {
   for (i in 1:N) {
      int IndCurrRow[5] = Ind[i,1:5]; //select integer array for current row
      int NumNonInfs = sum(!is.inf(IndCurrRow)); //count number of non-infinity values in array
      int IndNoInf[NumNonInfs] = IndCurrRow[1:NumNonInfs]; //create integer array of length equal to non-Inf values in current integer array
      vector[NumNonInfs] SubsetVals = Vals[IndNoInf]; //subset Vals with indexes from IndNoInf
   }
}

Integers are limited to $\mp 2^{31} \pm 1$, so I would use the most negative one as a padding value.

Thanks @bgoodri. Adjusting the code above to use -1 as the padding value, I now get the following (hopefully you think this looks sensible):

data {
   int N; //num observations
   int Ind[N, 5]; //Array of N integer arrays of length 5 padded with -1 (at the end when total non-Neg values < 5)
   vector[20] Vals; //vector to subset 
}
model {
   for (i in 1:N) {
      int IndCurrRow[5] = Ind[i,1:5]; //select integer array for current row
      int NumPositive = sum(IndCurrRow >= 0); //count number of positive values in array
      int IndPositive[NumPositive] = IndCurrRow[1:NumPositive]; //create integer array of length equal total positive values in current integer array
      vector[NumPositive] SubsetVals = Vals[IndPositive]; //subset Vals with indexes from IndPositive
   }
} 

I am not sure sum(IndCurrRow >= 0) is legal, but that ought to be something you can tabulate in the transformed data block.
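
A minimal sketch of that suggestion, assuming the padding value is -1, that padding only ever appears at the end of a row, and that real indexes are positive (Stan indexing starts at 1). The name `NumValid` is hypothetical, and the code uses the current `array[...] int` declaration syntax (older Stan releases wrote this as `int Ind[N, 5]`). The count is done once with an explicit loop in `transformed data` rather than with a vectorized comparison.

```stan
data {
  int<lower=0> N;        // number of observations
  array[N, 5] int Ind;   // N index rows of length 5, padded at the end with -1
  vector[20] Vals;       // vector to subset
}
transformed data {
  // number of real (non-padding) indexes in each row, computed once
  array[N] int<lower=0, upper=5> NumValid;
  for (i in 1:N) {
    NumValid[i] = 0;
    for (j in 1:5) {
      if (Ind[i, j] > 0) {
        NumValid[i] += 1;
      }
    }
  }
}
model {
  for (i in 1:N) {
    // padding sits at the end, so the first NumValid[i] entries are the real indexes
    vector[NumValid[i]] SubsetVals = Vals[Ind[i, 1:NumValid[i]]];
    // ... use SubsetVals here
  }
}
```

Because `Ind` is data, the counts never change between iterations, so computing them in `transformed data` avoids repeating the work every time the model block is evaluated.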