Multiple indexing from a multidimensional array to a vector

I have a vector of observations y (of length N) with associated indices (ii, jj, kk), each of length N, and an array of parameters theta[i, j, k] in my model, and I would like to write a vectorised sampling statement equivalent to

for (l in 1:N) y[l] ~ normal(theta[ii[l], jj[l], kk[l]], sigma);

Is it possible to vectorise this directly? If not, does it make sense to do it indirectly, somehow, like vectorizing (“raveling”) the three-dimensional array and computing the indices to the vector manually?

Not directly, the multiple indexing in Stan indexes all combinations of inputs. Check out the Matrix Indexing Table here: 6.8 Multiple Indexing and Range Indexing | Stan Reference Manual (the last row, concerning a[is, js], returns a matrix not a vector).

The only comment I would make is that, if this is the only time theta is used, then could theta be a single dimensional vector of length L? In that case vectorization would be possible. Is theta being used elsewhere?

1 Like

Also welcome to the forums!

Thanks for your reply. Yes, theta is used elsewhere. The last index is time, and there is an AR-like model over that index. And later there will be more structure over the other indices, maybe.

I saw the Matrix Indexing Table but suspected that there may be another syntax hiding somewhere. ;)

There seems to be to_array_1d— I guess I’ll try it and compute the index manually.

Ah, bummer.

For reference (if it helps), 3 dimensional arrays are C++ std::vectors of std::vectors of std::vectors. Stan Matrix and vector types are Eigen Matrix and Vector types. The biggest performance boost from vectorization is when there’s room for lots of shared calculation. I don’t think that’s true for a bunch of scalar normal_lpdf evaluations, but I don’t have a great feel for what’s gonna be fast and slow here. You could very well get some performance boost by shuffling things around and making the memory happier.

1 Like