Syntax for vectorizing with two indices?

Suppose I have a model term like beta[Age][State], where Age and State are int arrays of the same length, both indexing the same observations, so beta[Age[n]][State[n]] is a scalar. If there were just one index, something like alpha[Age], I could use it in a vectorized formula. But beta[Age][State] fails, I think because it’s actually an array of vectors. Yet it sort of “ought” to work, in the sense that indexing it with [n, n] is like indexing alpha[Age] with [n].

I can work around it by defining a new vector in transformed parameters that does the double indexing, but that’s somewhat painful, since I then have to define a different one for post-stratification, etc. Or I can un-vectorize my sampling statement, but that comes with a performance hit, I think. Is there any way to do this directly?

I could imagine writing a function like vector diagV(vector[] vs) {...} to do it, but I’m not sure how to write a function that operates at the index level. If it builds the entire new vector, I’m afraid it will recompute the whole vector each time a single element is required, which would be bad for performance, I think.

Any ideas are most welcome!

Hi Adam,

Unfortunately this kind of multi-indexing isn’t currently available. I’m working on a similar facility for matrices (using two integer arrays to extract a single vector), and I can add arrays of vectors/arrays to that implementation.

The simplest thing would be to define a similar function in pure Stan:

functions {
  // Extracts the vector whose i-th element is in_array[ii[i]][jj[i]],
  // i.e. double-indexes an array of vectors with two parallel index arrays.
  vector array_to_vector(vector[] in_array, int[] ii, int[] jj) {
    int vec_size = num_elements(ii);
    vector[vec_size] out_vec;

    for (i in 1:vec_size) {
      out_vec[i] = in_array[ii[i]][jj[i]];
    }

    return out_vec;
  }
}

This is then used as array_to_vector(beta, Age, State), which gives the same result that you were hoping beta[Age][State] would return.
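For example, it drops straight into a vectorized statement (a minimal sketch; y, alpha, and sigma here are placeholder names, not from your model):

```stan
model {
  // the function call produces one vector[N], so the likelihood
  // stays fully vectorized
  y ~ normal(alpha + array_to_vector(beta, Age, State), sigma);
}
```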

Thanks for clarifying and giving code for the function. A follow-up question: if I included that function and then used it in a sampling statement like:
S ~ binomial_logit(T, alpha + array_to_vector(beta, Age, State)) would that be efficient? I guess I don’t really understand how Stan’s vectorization works, so it’s not clear to me whether this gets computed once, when the whole vector is needed, or once each time a single entry of the vector is required.

Also, I think the function I actually want is

vector indexBoth(vector[] vs, int N) {
  // Takes the "diagonal" of an array of vectors: out_vec[i] = vs[i][i].
  vector[N] out_vec;
  for (i in 1:N) {
    out_vec[i] = vs[i, i];
  }
  return out_vec;
}

because in my case the vector expression already has the sub-indexes in it, as if I had passed your function beta[Age][State] as its first argument rather than beta. That’s because the actual term is:
{eps_Age_State[State], -eps_Age_State[State]}[Age] which I do not know how to write in un-indexed form…

Also, I am running with it now and it is very, very slow. Maybe I did something wrong, but I suspect the entire vector is getting computed far too often, or something like that. It’s an order of magnitude slower than the version that does the indexing in transformed parameters.

Thanks!

Can you post the full model that you’re trying to vectorise? I’m not entirely sure what you’re after.

Sure! Here is the one using my version of the function you suggested:

functions {
  vector indexBoth(vector[] vs, int N) {
    vector[N] out_vec;
    for (i in 1:N) {
      out_vec[i] = vs[i, i];
    }
    return out_vec;
  }
}
data {
  int<lower=2> N_State;
  int<lower=1> N;
  int<lower=1> State[N];
  int<lower=2> N_Age;
  int<lower=1> Age[N];
  int<lower=0> T[N];
  int<lower=0> S[N];
}
transformed data {
  vector<lower=0>[N_State] beta_State_weights = rep_vector(0, N_State);
  for (n in 1:N) {
    beta_State_weights[State[n]] += 1;
  }
  beta_State_weights /= N;
  vector<lower=0>[N_State] eps_Age_State_weights = rep_vector(0, N_State);
  for (n in 1:N) {
    eps_Age_State_weights[State[n]] += 1;
  }
  eps_Age_State_weights /= N;
}
parameters {
  real alpha;
  real eps_Age;
  real<lower=0> sigma_State;
  vector[N_State] beta_raw_State;
  vector[N_State] eps_Age_State_raw;
  real<lower=0> sigma_Age_State;
}
transformed parameters {
  vector[N_State] beta_State;
  beta_State = sigma_State * beta_raw_State;
  vector[N_State] eps_Age_State;
  eps_Age_State = sigma_Age_State * eps_Age_State_raw;
}
model {
  alpha ~ normal(0, 2.0);
  eps_Age ~ normal(0, 2.0);
  sigma_State ~ normal(0, 2.0);
  beta_raw_State ~ normal(0, 1) ;
  dot_product(beta_State, beta_State_weights) ~ normal(0, 1.0e-2);
  sigma_Age_State ~ normal(0, 2.0);
  eps_Age_State_raw ~ normal(0, 1);
  dot_product(eps_Age_State, eps_Age_State_weights) ~ normal(0, 1.0e-2);
  S ~ binomial_logit(T, alpha + to_vector({eps_Age, -eps_Age}[Age]) + beta_State[State] + indexBoth({eps_Age_State[State], -eps_Age_State[State]}[Age], N));
}

This is slow: it reports that it expects 1000 transitions to take 3679 s! I haven’t yet let it run to completion.

Here’s the version with the “workaround”:

data {
  int<lower=2> N_State;
  int<lower=1> N;
  int<lower=1> State[N];
  int<lower=2> N_Age;
  int<lower=1> Age[N];
  int<lower=0> T[N];
  int<lower=0> S[N];
}
transformed data {
  vector<lower=0>[N_State] beta_State_weights = rep_vector(0, N_State);
  for (n in 1:N) {
    beta_State_weights[State[n]] += 1;
  }
  beta_State_weights /= N;
  vector<lower=0>[N_State] eps_Age_State_weights = rep_vector(0, N_State);
  for (n in 1:N) {
    eps_Age_State_weights[State[n]] += 1;
  }
  eps_Age_State_weights /= N;
}
}
parameters {
  real alpha;
  real eps_Age;
  real<lower=0> sigma_State;
  vector[N_State] beta_raw_State;
  vector[N_State] eps_Age_State_raw;
  real<lower=0> sigma_Age_State;
}
transformed parameters {
  vector[N_State] beta_State;
  beta_State = sigma_State * beta_raw_State;
  vector[N_State] eps_Age_State;
  eps_Age_State = sigma_Age_State * eps_Age_State_raw;
  vector[N] y_Age_State;
  for (n in 1:N) {
    y_Age_State[n] = {eps_Age_State[State[n]], -eps_Age_State[State[n]]}[Age[n]];
  }
}
model {
  alpha ~ normal(0, 2.0);
  eps_Age ~ normal(0, 2.0);
  sigma_State ~ normal(0, 2.0);
  beta_raw_State ~ normal(0, 1) ;
  dot_product(beta_State, beta_State_weights) ~ normal(0, 1.0e-2);
  sigma_Age_State ~ normal(0, 2.0);
  eps_Age_State_raw ~ normal(0, 1);
  dot_product(eps_Age_State, eps_Age_State_weights) ~ normal(0, 1.0e-2);
  S ~ binomial_logit(T, alpha + to_vector({eps_Age, -eps_Age}[Age]) + beta_State[State] + y_Age_State);
}

This runs as expected: it estimates 1000 transitions in 25 s and completes 1000 warmup / 1000 sampling iterations in 150 s.

The workaround is fine. But I am wondering whether there is an efficient way to put the “diagonally-indexed” vector[] directly into the vectorized sampling statement. I’m generating this code from Haskell, and while it’s easy enough to emit the workaround, there’s some book-keeping involved in using one expression when sampling and a different one for post-stratification. Until I hit this, that book-keeping was simpler and revolved entirely around what goes into the index arguments; this requires a more complex bit of decision-making.
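For what it’s worth, one direction I may try (a sketch only; signed_by_age is a name I’m making up, and I’m assuming Age is coded 1/2): fold the sign flip into the function itself, so the length-N vector is built directly from eps_Age_State and the two index arrays, without ever materializing {eps_Age_State[State], -eps_Age_State[State]}[Age], which is an array of N vectors each of length N.

```stan
functions {
  // Hypothetical helper: out_vec[n] = +eps[State[n]] if Age[n] == 1,
  // and -eps[State[n]] otherwise, built in a single O(N) pass.
  vector signed_by_age(vector eps, int[] State, int[] Age, int N) {
    vector[N] out_vec;
    for (n in 1:N) {
      out_vec[n] = Age[n] == 1 ? eps[State[n]] : -eps[State[n]];
    }
    return out_vec;
  }
}
```

The sampling statement would then use signed_by_age(eps_Age_State, State, Age, N) in place of the indexBoth(...) term.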