3-dimensional object


I have a 3-dimensional object, beta[X,Y,Z]. I want to loop over the X-index and the Y-index while taking advantage of the vectorized operation on the z-index. Specifically, I am thinking:

for (x in 1:X) {
for (y in 1:Y) {
beta[x,y,] = … //standard operations done to a vector or a row-vector//

My question is: what should be the preferred data type I should use to assign beta?
I know that a 3-dimensional matrix type, Matrx[X,Y,Z] beta, does not exist in Stan. Hence I would like to seek advice on which data type I should assign for beta to facilitate what I am trying to do. Thanks.

But a 3-dimensional array type does exist: 5.5 Array Data Types | Stan Reference Manual


Yes, I know about the 3-dimensional array. But I have trouble understanding it because 1) I don’t know which kind of array I should declare and 2) I don’t know how to write the assignment properly, when I am trying to do:

beta[x,y,] = …

Any suggestion would be helpful.

Do you mean you don’t know whether to declare it as a real or integer?

See " Partial Array Assignment" within here: 5.5 Array Data Types | Stan Reference Manual

Thanks! Re-reading that section helps. I have another follow-up question, if I may:

real beta[X,Y,Z];
real sum_middle_beta[X,Z];
real sum_beta [Z];

Without doing for-loops, is there a sum command over just the middle index so that I can do something like: sum_middle_beta= sum(beta, 2)?
Or perhaps even sum_beta = sum(sum(beta,2),1)?

In the program that I am writing, Z is the largest index. So I am doing what I can to vectorize, especially I need to avoid any for-loop over Z. Thanks.

If you want to see what functions Stan has, look here: https://mc-stan.org/docs/2_19/functions-reference/index.html

Also, don’t worry too much about vectorizing. Yes, it can speed things up somewhat, but loops in Stan are much faster than loops in Python or R. If you aren’t careful, you may end up spending more time figuring out how to vectorize than you’s spend running your un-vectorized Stan model!

This is what I thought at first. Unfortunately, this turns out to be wrong for what I am doing. My model is a hierarchical binomial or multinomial regression. I had a version of the code that has a for-loop over every observation in the model block. It worked fine when the number of subjects was small (<300). But it took way too long when the number of subjects got bigger (~10000). So I am doing a lot of known improvements, including vectorization. For the record, the vectorized binomial model worked well as of my team’s latest testing. So I am moving to multinomial. This is why I am trying to do all these row-sums and/or column-sums. I know the direct functions do not exist in Stan. I would have to multiply by some matrix of 1’s somehow. I just cannot figure out the best way, especially for a 3-dimentional matrix.

Thanks for your help so far.

If you or someone can tell me the best way to do these, it would be much appreciated.

sum_middle_beta= sum(beta, 2) ?
even sum_beta = sum(sum(beta,2),1) ?

Can you show us this code? It may be that it had other bottlenecks besides the ones you think it has.

Unfortunately I can’t.

I just want to know how to do some similar basic computation. Here is another one:

vector[N] beta_part[K] // an array of K vectors of N elements
vector[N] beta // a vector of N elements

Can I do the following?


You are asking questions that can easily be answered from the manual. I strongly suggest that you read through it.

See here: 5.5 Array Data Types | Stan Reference Manual

See here: 5.6 Reductions | Stan Functions Reference

Maybe I just could not understand the manual. I know I could declare an array of vectors and matrices. I know I could sum multiple vectors together.
But the manual did not specify how one may sum a bunch of vectors that are part of an array.

That’s a fairly good indicator that there isn’t one.

If N \gg K, then this should probably be good enough:

beta = beta_part[1];
for (i in 2:K) {
   beta += beta_part[i];