Vectorized Matrix/Vector Multiplication with Factors

Hi Stan gang!

In Stan User’s Guide 22.8 Vectorization, at the bottom of the Vectorization through matrix operations section, they give the following “most efficient” code:

data {
  matrix[N, K] x;
  vector[N] y;
}
parameters {
  vector[K] beta;
}
model {
  y ~ normal(x * beta, 1);
}

Where they have a single column vector of betas, being multiplied by a matrix x, to create the row vector y (and some noise).

Suppose instead that I had a column vector of length N called “species” (converted to some numerical representation), that ranged from 1 to n_species. In this case, each row i of the matrix X is associated with a particular species (species[i]). Suppose that I want to generate different sets of betas for each species. I would create an array of beta column vectors, i.e.:

data {
  int<lower = 1> n_species;
  matrix[N, K] x;
  vector[N] y;
  vector[N] species;
}
parameters {
  vector[K] beta[n_species];
}

I’m stuck on how to keep this encapsulated as a single matrix/vector multiplication. My initial thought was:

model {
  y ~ normal(x * beta[,species], 1);
}

Such that the beta matrix is subset by the corresponding set of betas for species for observation i. However, this ends up just creating a new matrix with dimensions K x N, so my result is an N x N matrix, rather than an N x 1 column vector.

I could just create an N x K matrix of betas, with each row being the proper set of betas for that given species, and just do rowwise dotproduct, however this seems super inefficient. What would be the best practice to do here?

Thanks in advanced!

EDIT: I should mention too, it just so happens that my X matrix is ordered based on the species factor. That is, the first, say, 100 rows are ALL species 1, then the next 200 rows are ALL species 2, then the next 143 rows are ALL species 3, and so on. So, one other way I thought about doing this was when generating betas for species 1 for example, use the segment() function to apply this set of betas to the segment of X that are associated with species 1, and then move on to the next segment for species 2, in a similar fashion to dealing with ragged data structures. But, I get a little sketched out if I’m removing some generality like this, as it assumes that the data is going to be ordered by species. Mind you, I can’t see that assumption changing, but still…

1 Like

Hi Brandon,

Given your base model:

data {
  int N;
  int K;
  int<lower = 1> n_species;
  matrix[N, K] x;
  vector[N] y;
  int species[N];
}
parameters {
  vector[K] beta[n_species];
}

To construct the normal likelihood in a vectorized way, you could first declare beta as an N_{species} \text{ x }K matrix:

parameters {
  matrix beta[n_species, K];
}

If you then index this using the species variable:

beta[species]

This gives an N \text{ x }K matrix containing the relevant beta coefficients for a given individual. If call this with rows_dot_product:

rows_dot_product(x, beta[species]) 

This will give the vector of mu values to pass to normal.

As a full model, it would look like:

data {
  int N;
  int K;
  int<lower = 1> n_species;
  matrix[N, K] x;
  vector[N] y;
  int species[N];
}
parameters {
  matrix[n_species, K] beta;
}
model {
  y ~ normal(rows_dot_product(x, beta[species]) , 1);
}

EDIT: updated to use rows_dot_product instead of post-multiplication

6 Likes

Good morning Andrew,

This seems like it is the way to go! Looks like I was on the right track with my initial thought.

Just want to clarify something, after testing in R. You say that I should index the beta matrix using the species variable like this:

  beta[species]

However, since this is a matrix, should I actually be indexing like:

  beta[species,]

noting the addition of the comma? Simply because I want to index that first dimension, which is the species dimension.

Thanks again!

1 Like

Both styles of indexing (with comma and without) will produce the same result, since stan assumes that you want the entire row when you use a single index

1 Like

That’s pretty slick! Thanks for your help. I’m brand-spanking new to Stan, coming over from the JAGS world, so still getting accustomed to what can and cannot be done with Stan. But I have to say, I’m having a great time so far! :)

2 Likes