 Repeat rows of matrix according to values held in array

Hi all,
I’m very new to Stan, so apologies for what I suspect is a simple problem - I haven’t managed to find a solution that works for my code on any of the online help that I have checked.

I’m trying to write a function for use in my Stan program that will take two arguments: (1) a matrix X of N rows and p columns, and (2) an array m (of length N) of integer values giving the number of times to repeat each row of X (I know that in Stan vectors hold reals not integers, so I can’t pass m as a vector?). I would like the function to output a matrix Xout with sum(m) rows and p columns, so the ith row of X (denoted X[i,]) is repeated m[i] times, creating a “long” version of the matrix. I cannot do this outside Stan, as X will contain values that update during model fitting.

I have tried a range of things, each of which gives errors (e.g. (1) passing a vector m of reals in, and extracting each element, assigning to a temporary int value (not shown), or (2) passing in m as an array of integers (non-working code below)). I have been using rep_matrix to repeat the rows of the matrix.

The best I have so far is (I think):

matrix matrixlongver(matrix X, int m[]){
row_vector[cols(X)] Xtemp = X[1,];
matrix[m,cols(X)] Xout = rep_matrix(Xtemp,m);
for(i in 2:rows(X)){
Xtemp = X[i,];
matrix[m[i],cols(X)] Xtemp2 = rep_matrix(Xtemp,m[i]);
Xout = append_row(Xout,Xtemp2)
}
return Xout;
}

Which gives me error message:

"PARSER EXPECTED: <argument declaration or close paren ) to end argument declarations>
Error in stanc(file = file, model_code = model_code, model_name = model_name, : "

Would it be possible to have any help / hints / tips / tricks / pointers to already solved problems for the best way of going about this?
Many thanks

Hi,
it looks like the C-like nature of Stan is biting you (which happens very easily :-) and the error messages are not very helpful either. Anyway, there are multiple small issues with your syntax and some problems with your logic.

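• int m[]: for function argument declarations you need int[] m. There is no good reason for this, it is just how Stan's function signatures work (it is the reverse of C, where you would write int m[]). Newer Stan versions spell this array[] int m.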
• Stan allows variable declarations only at the very beginning of each block ( { ... } ), so instead of
for(i in 2:rows(X)){
Xtemp = X[i,];
matrix[m[i],cols(X)] Xtemp2 = rep_matrix(Xtemp,m[i]);

you need to write:

for(i in 2:rows(X)){
matrix[m[i],cols(X)] Xtemp2;
Xtemp = X[i,];
Xtemp2 = rep_matrix(Xtemp,m[i]);
• Finally, sizes of matrices and vectors are fixed in Stan, so Xout = append_row(Xout, Xtemp2); cannot work, as that would change the size of Xout. You need to preallocate the full-size matrix and then assign to slices of it.

Here’s how I would write that function:

matrix matrixlongver(matrix X, int[] m){
  // Preallocate the full output: one row per requested repetition.
  matrix[sum(m),cols(X)] Xout;
  int next_row = 1;
  for(i in 1:rows(X)){
    if(m[i] < 0) {
      reject("m has to be non-negative");
    }
    // Fill the next m[i] rows with copies of row i of X.
    Xout[next_row:(next_row + m[i] - 1),] = rep_matrix(X[i,], m[i]);
    next_row = next_row + m[i];
  }
  return Xout;
}

and a piece of R code showing that it seems to work OK:

stan_code <- '
functions {
  matrix matrixlongver(matrix X, int[] m){
    matrix[sum(m),cols(X)] Xout;
    int next_row = 1;
    for(i in 1:rows(X)){
      if(m[i] < 0) {
        reject("m has to be non-negative");
      }
      Xout[next_row:(next_row + m[i] - 1),] = rep_matrix(X[i,], m[i]);
      next_row = next_row + m[i];
    }
    return Xout;
  }
}'

library(rstan)
expose_stan_functions(stan_model(model_code = stan_code))
# 3 x 2 matrix needs six values (1:5 would be recycled with a warning)
in_matrix <- matrix(1:6, nrow = 3, ncol = 2)
in_matrix
m <- c(0,2,3)
m
matrixlongver(in_matrix, m)

Note that this will give you the following message:

DIAGNOSTIC(S) FROM PARSER:
Info: left-hand side variable (name=next_row) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.

But this is safe to ignore here.
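
As an aside, here is a rough sketch of another way to get the same result, not something tested in this thread. Since m is data, you could build the row index once in transformed data and then use Stan's multi-indexing (X[idx, ]) to pull out the repeated rows. The helper name repeat_index, the data names N and p, and the normal prior are all made up for illustration. The code uses the same older array syntax as above, so newer Stan versions would need array[] int instead.

```stan
functions {
  // Hypothetical helper (not from the thread): for every row of the long
  // matrix, returns the index of the row of X it should be copied from.
  int[] repeat_index(int[] m) {
    int idx[sum(m)];
    int pos = 1;
    for (i in 1:size(m)) {
      for (j in 1:m[i]) {  // runs zero times when m[i] == 0
        idx[pos] = i;
        pos = pos + 1;
      }
    }
    return idx;
  }
}
data {
  int<lower=1> N;      // rows of X (assumed name)
  int<lower=1> p;      // columns of X (assumed name)
  int<lower=0> m[N];   // repeat counts
}
transformed data {
  // Computed once, because m never changes during fitting.
  int idx[sum(m)] = repeat_index(m);
}
parameters {
  matrix[N, p] X;
}
transformed parameters {
  // Multi-indexing with an integer array repeats rows of X as needed.
  matrix[sum(m), p] Xlong = X[idx, ];
}
model {
  to_vector(X) ~ normal(0, 1);  // placeholder prior, for illustration only
}
```

With m = {0, 2, 3} this should give idx = {2, 2, 3, 3, 3}, so Xlong has two copies of row 2 followed by three copies of row 3. If you do not want Xlong saved in the output, declare it as a local variable in the model block instead.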

Hope that helps!