Matrix algebra: parameter with two indices

jgellar · August 6, 2018, 8:17pm

I have a model with a coefficient (theta) that I want to vary by two other categorical variables, group and time. Suppose there are G=20 groups and T=10 time points, then I can declare the theta parameter as

  matrix[T, G] theta ;

The group and time indices are passed in as data, as integer vectors of length N (the number of observations). The easy way to pick out the theta element that corresponds to each observation is a for loop in the model block, e.g.

real theta_use[N];
for (n in 1:N) {
  // coefficient for observation n: row = time point, column = group
  theta_use[n] = theta[time[n], group[n]];
}

It seems this could be done more efficiently with matrix algebra, but I can’t figure out the best way. Any suggestions? I was thinking that I could pass in (from R) two “map” matrices of dummy variables, time_map and group_map, that are N × T and N × G respectively (no intercept). I’m just not sure what to do with these in Stan.
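For reference, here is one way the two map matrices could be used in Stan. This is a minimal sketch, not a recommendation. It assumes each row of time_map and group_map has a single 1 in the column for that observation's time point or group, and the normal(0, 1) prior is only a placeholder. The array-free declarations below work in both old and new Stan syntax.

```stan
data {
  int<lower=1> N;          // number of observations
  int<lower=1> T;          // number of time points
  int<lower=1> G;          // number of groups
  matrix[N, T] time_map;   // row n is 1 in column time[n], 0 elsewhere
  matrix[N, G] group_map;  // row n is 1 in column group[n], 0 elsewhere
}
parameters {
  matrix[T, G] theta;
}
model {
  // time_map * theta is N x G: row n is a copy of row time[n] of theta.
  // The row-wise dot product with group_map then keeps only the
  // entry in column group[n], giving theta[time[n], group[n]].
  vector[N] theta_use = rows_dot_product(time_map * theta, group_map);

  to_vector(theta) ~ normal(0, 1);  // placeholder prior (assumption)
  // ... the rest of the model would use theta_use here ...
}
```

Each element of theta_use costs T × G multiplications here, almost all of them by zero, where direct indexing needs a single lookup.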

FWIW, this is just a small part of a much larger program, but I eliminated the rest of the details as they are not necessary for understanding this issue.

Thanks,
Jon

bbbales2 · August 7, 2018, 8:17am

I think for what you describe, the code you wrote is better. Doing row/element selections with linear algebra operations isn’t gonna be efficient. You’d be multiplying by vectors of mostly zeros.

jgellar · August 7, 2018, 1:42pm

I did some tests in Stan and it looks like you’re right. In R the opposite is true - the linear algebra is much more efficient than the for loop. But I think that’s because for loops are particularly slow in R.

Oh well.

Bob_Carpenter · August 15, 2018, 12:28am

Yes, loops are slow in R, but Stan compiles straight to C++, so looping is fast. There’s a chapter in the manual on efficiency that explains why the matrix version loses: the autodiff overhead of the multiply-by-zero is high in both memory and time.
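If a loop-free statement is still wanted for readability, one alternative that avoids the multiply-by-zero is to precompute a single linear index and use Stan's multiple indexing. The sketch below is an illustration under stated assumptions, not something benchmarked in this thread. The name idx is hypothetical, the prior is a placeholder, and the `array[N] int` declarations use the newer array syntax (older Stan versions write `int time[N]`).

```stan
data {
  int<lower=1> N;                         // number of observations
  int<lower=1> T;                         // number of time points
  int<lower=1> G;                         // number of groups
  array[N] int<lower=1, upper=T> time;    // time index per observation
  array[N] int<lower=1, upper=G> group;   // group index per observation
}
transformed data {
  // to_vector() flattens a matrix in column-major order, so
  // theta[t, g] ends up at position (g - 1) * T + t.
  array[N] int<lower=1, upper=T * G> idx; // hypothetical helper index
  for (n in 1:N) {
    idx[n] = (group[n] - 1) * T + time[n];
  }
}
parameters {
  matrix[T, G] theta;
}
model {
  // One multi-index lookup instead of an explicit loop over n.
  vector[N] theta_use = to_vector(theta)[idx];

  to_vector(theta) ~ normal(0, 1);  // placeholder prior (assumption)
  // ... the rest of the model would use theta_use here ...
}
```

The index is built once in transformed data, so each evaluation of the model block only does the lookup.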
