Dimension mismatch when passing array of matrix from R to Stan

After trying several times, I found that when passing array of matrix from R to Stan, the dimensions mismatch.

For example, in R, I construct a one-dimensional array of size 1 containing a value that is a 2\times 2 matrix. The R code is as the following:

I <- 2
J<-2
T <- 1
X <- array(c(1,2,3,4),dim=c(I,J,T))
fit = stan('test_pass_matrix.stan',data=list(X=X,I=I,J=J,T=T))

In the Stan, I declare an array of the same size containing a Matrix with the same dimension as the that in the R code, as the following:

data {
  int<lower=1> I;
  int<lower=2> J;
  int<lower=1> T;
  matrix[J,I] X[T];
}

After running the R code, an error of dimension mismatch appears, as the following: **Error in mod$fit_ptr() :
Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=X; position=0; dims declared=(1,2,2); dims found=(2,2,1) (in ‘modela515adfc8a6_test_pass_matrix’ at line 21)

failed to create the sampler; sampling not done**

The dimension of the array containing matrix is (2,2,1) in R, which should be the same as the dimension (1,2,2) of the array containing matrix declared in Stan. Also, the first index of an array declared in Stan is the last index of the array declared in R. But when passing the data from R to Stan, it seems that the Stan code regards this difference in syntax of dimension between R and Stan as an error in dimension mismatch.

Is there any expert or developer of Stan who can help confirm whether this is an issue? Alternatively, is there any solution to this error of dimension mismatch?

I think arrays of matrices have the size of the array first, which is why Stan wants something with shape (1,2,2). It is slightly weird as that dimension comes last in the declaration. I think there is a more intuitive ve way of declaring this type/shape, but I have forgotten it.

Edit: This! Index error when using new ODE interface - #2 by rok_cesnovar

I have repeatedly searched for it and failed to find it in the documentation.

In R, the dimension of a one-dimensional array of size 1 containing a value that is a 3*3 matrix is (3,3,1). Note that the length of the array is 1, which is the last index of (3,3,1).

However, in Stan, the dimension of the same array is (1,3,3). Note that here the first index “1” refers to the length of the array.

I tried to declare an array containing matrix with a dimension of (1,3,3) in R, and indeed the Stan code didn’t report error in dimension mismatch. However, the (1,3,3) array in R is not what I need, even though the data can be passed to the Stan code without incurring any error.

I believe that this is indeed a bug of the Stan. But I’m not sure whether the same issue happens in other programming languages, like Python, Java…

Hmmmm… Does this related question help? Pass two-dimensional array of matrices into RStan

It looks like the issue is that in R it’s natural to have the last index vary slowest, making it the preferred way to store arrays with k matrices of size (m, n) as some object of size (m, n, k). I don’t know whether there is a way around having to adapt this part of the R code to Stan’s way, which is storing these matrices as some object of size (k, m, n). In Python (which I use) this is how I would do it anyways, so there is no issue here.

Something like this surely comes up regularly, maybe there exists a good solution somewhere.

I don’t think this solution solves the issue.
I’m trying to construct the desired array of matrix in Stan code, in particular, in the generated data block.
Thanks!

Construct the array in R in which ever order is most useful to you, then use aperm to reorder the array into the Stan order.

1 Like

But the array is 3-dimensional. I’ve created a workaround solution in another post.

There is no Stan error here. Looking at the Stan data declaration:

matrix[J,I] X[T];

You’re declaring that there should be T matrices which each have J rows and I columns (i.e., T,J,I).

But in your R syntax:

X <- array(c(1,2,3,4),dim=c(I,J,T))

You’re specifying an array of I matrices each with J rows and T columns.

As a simple example, this works without issue:

I <- 2
J <- 2
T <- 1
X <- array(c(1,2,3,4),dim=c(T,J,I))

modcode = "data {
  int<lower=1> I;
  int<lower=2> J;
  int<lower=1> T;
  matrix[J,I] X[T];
}
parameters {
  real y;
}
model {
  y ~ normal(0,1);
}"

t = stan(model_code = modcode,data = list(I=I,J=J,T=T,X=X))
1 Like

In R the first parameter in dim(a,b,c) is the number of rows for each matrix, the second parameter is the number of column, and last parameter is the length of the array. So, It’s different from the syntax in Stan.

发自我的iPhone

In R the first parameter in dim(a,b,c) is the number of rows for each matrix, the second parameter is the number of column, and last parameter is the length of the array.

But that’s not the case though?

See:

I <- 2
J <- 2
T <- 1
X <- array(c(1,2,3,4),dim=c(T,J,I))
> dim(X)
[1] 1 2 2

Indicates that the first dimension is T, the length of the array. Further, if you select the first element of the array, it returns a 2x2 matrix:

> X[1,,]
     [,1] [,2]
[1,]    1    3
[2,]    2    4

Which is the same behaviour as Stan

2 Likes

If you Google “R array of matrices” you seem to always get the pattern @diyinhaitang describes, see eg data structures - R creating an array of matrices - Stack Overflow

I guess this is due to R’s storage strategy, making the matrices contiguous in memory only in @diyinhaitang’s formulation?

You are right. I realized that the essence of transposition of a matrix is to maintain all the elements in the original matrix and change the order of dimensions. But which dimension is regarded as row, column, or length of the array is not that strict as long as all the elements in the matrix can be accessed when passed to Stan.

In most reference of R, the last dimension of an array is regarded as its length. However, I finally realized that which dimension is regarded as row, column, or length of the array is not that strict in R as long as all the elements in the matrix can be accessed when passed to Stan.

1 Like

Yes, it’s just that it appears to be customary in R to handle arrays of matrices as you described it first.