Syntax for vectorizing with two indices?

adamConnerSax · April 7, 2021, 9:10pm

Suppose I have a model term like beta[Age][State] where Age and State are each int arrays of the same length, indexes for the same thing. So beta[Age[n]][State[n]] is a scalar. If this were just one index, something like alpha[Age], I could use it in a vectorized formula. But putting beta[Age][State] fails, I think because it’s actually a vector-of-vectors. But it sort of “ought” to work in the sense that indexing it with [n, n] is like indexing alpha[Age] with [n].

I can work around it by defining a new vector in transformed parameters that does the double indexing. but that’s sort of painful since then I have to define a different one for post-stratification, etc. Or I can un-vectorize my sampling statement but that comes with a performance hit, I think. Is there any way to do this directly?

I could imagine writing a function vector diagV(vector[]) {...} to do it but I’m not sure how to write a function that operates at the index-level. If it builds the entire new vector I’m afraid it will compute the entire new vector each time a single element is required. Which would be bad, performance-wise, I think.

Any ideas are most welcome!

andrjohns · April 8, 2021, 12:10am

Hi Adam,

Unfortunately this kind of multi-indexing isn’t currently available. I’m working on a similar facility for matrices (using two integer arrays to extract a single vector), but I can add arrays of vectors/arrays to the implementation.

The simplest thing would be to define a similar function in pure Stan:

functions {
  vector array_to_vector(vector[] in_array, int[] ii, int[] jj) {
    int vec_size = num_elements(ii);
    vector[vec_size] out_vec;

    for (i in 1:vec_size) {
      out_vec[i] = in_array[ii[i]][jj[i]];
    }

    return out_vec;
  }
}

Which is then used as array_to_vector(beta, Age, State) to give the same result that you were hoping beta[Age][State] to return

adamConnerSax · April 8, 2021, 12:21am

Thanks for clarifying and giving code for the function. A followup question: if I included that function and then used it in a sampling statement like:
S ~ binomial_logit(T, alpha + arrayToVector(beta, Age, State)) would that be efficient? I guess I don’t really understand how Stan’s vectorizing works so it’s not clear to me if this is computed once when the whole vector is needed or once each time a single entry in the vector is required.

Also, I think the function I want is actually

vector indexBoth(vector[] vs, int N)   {
    vector[N] out_vec;
    for (i in 1:N)     {
      out_vec[i] = vs[i, i];
    }
    return out_vec;
  }

because in my case the vector expression already has the sub-indexes in it, as if I gave your function the beta[Age][State] as a first argument rather than beta. But that’s because the actual term is :
{eps_Age_State[State], -eps_Age_State[State]}[Age] which I do not know how to write in un-indexed form…

Also, I am running with it now and it is very very slow. Maybe I did something wrong but I somehow think the entire vector is getting computed much too often or something. It’s an order of magnitude slower than the version where I do the transformed parameters thing.

Thanks!

andrjohns · April 8, 2021, 2:35am

Can you post the full model that you’re trying to vectorise? I’m not entirely sure what you’re after

adamConnerSax · April 8, 2021, 12:54pm

Sure! Here is the one using my version of the function you suggest:

functions {
  vector indexBoth(vector[] vs, int N)   {
    vector[N] out_vec;
    for (i in 1:N)     {
      out_vec[i] = vs[i, i];
    }
    return out_vec;
  }
}
data {
  int<lower=2> N_State;
  int<lower=1> N;
  int<lower=1> State[N];
  int<lower=2> N_Age;
  int<lower=1> Age[N];
  int<lower=0> T[N];
  int<lower=0> S[N];
}
transformed data {
  vector<lower=0>[N_State] beta_State_weights = rep_vector(0,N_State);
  for (n in 1:N)   {
    beta_State_weights[State[n]] += 1;
  }
  beta_State_weights /= N;
  vector<lower=0>[N_State] eps_Age_State_weights = rep_vector(0,N_State);
  for (n in 1:N)   {
    eps_Age_State_weights[State[n]] += 1;
  }
  eps_Age_State_weights /= N;
}
parameters {
  real alpha;
  real eps_Age;
  real<lower=0> sigma_State;
  vector[N_State] beta_raw_State;
  vector[N_State] eps_Age_State_raw;
  real<lower=0> sigma_Age_State;
}
transformed parameters {
  vector[N_State] beta_State;
  beta_State = sigma_State * beta_raw_State;
  vector[N_State] eps_Age_State;
  eps_Age_State = sigma_Age_State * eps_Age_State_raw;
}
model {
  alpha ~ normal(0, 2.0);
  eps_Age ~ normal(0, 2.0);
  sigma_State ~ normal(0, 2.0);
  beta_raw_State ~ normal(0, 1) ;
  dot_product(beta_State, beta_State_weights) ~ normal(0, 1.0e-2);
  sigma_Age_State ~ normal(0, 2.0);
  eps_Age_State_raw ~ normal(0, 1);
  dot_product(eps_Age_State, eps_Age_State_weights) ~ normal(0, 1.0e-2);
  S ~ binomial_logit(T, alpha + to_vector({eps_Age,  -eps_Age}[Age]) + beta_State[State] + indexBoth({eps_Age_State[State], -eps_Age_State[State]}[Age], N));
}

This is slow. It reports that it expects 1000 transitions to take 3679s! I’ve not yet let it run to completion.

Here’s the version with the “workaround”:

data {
  int<lower=2> N_State;
  int<lower=1> N;
  int<lower=1> State[N];
  int<lower=2> N_Age;
  int<lower=1> Age[N];
  int<lower=0> T[N];
  int<lower=0> S[N];
}
transformed data {
  vector<lower=0>[N_State] beta_State_weights = rep_vector(0,N_State);
  for (n in 1:N)   {
    beta_State_weights[State[n]] += 1;
  }
  beta_State_weights /= N;
  vector<lower=0>[N_State] eps_Age_State_weights = rep_vector(0,N_State);
  for (n in 1:N)   {
    eps_Age_State_weights[State[n]] += 1;
  }
  eps_Age_State_weights /= N;
}
parameters {
  real alpha;
  real eps_Age;
  real<lower=0> sigma_State;
  vector[N_State] beta_raw_State;
  vector[N_State] eps_Age_State_raw;
  real<lower=0> sigma_Age_State;
}
transformed parameters {
  vector[N_State] beta_State;
  beta_State = sigma_State * beta_raw_State;
  vector[N_State] eps_Age_State;
  eps_Age_State = sigma_Age_State * eps_Age_State_raw;
  vector[N] y_Age_State;
  for (n in 1:N)   {
    y_Age_State[n] = {eps_Age_State[State[n]], -eps_Age_State[State[n]]}[Age[n]];
  }
}
model {
  alpha ~ normal(0, 2.0);
  eps_Age ~ normal(0, 2.0);
  sigma_State ~ normal(0, 2.0);
  beta_raw_State ~ normal(0, 1) ;
  dot_product(beta_State, beta_State_weights) ~ normal(0, 1.0e-2);
  sigma_Age_State ~ normal(0, 2.0);
  eps_Age_State_raw ~ normal(0, 1);
  dot_product(eps_Age_State, eps_Age_State_weights) ~ normal(0, 1.0e-2);
  S ~ binomial_logit(T, alpha + to_vector({eps_Age,  -eps_Age}[Age]) + beta_State[State] + y_Age_State);
}

This runs as expected, claim of 1000 transitions in 25s and runs 1000 warmup/1000 sample in 150s.

The workaround is fine. But I am wondering if there is an efficient way to put the “diagonally-indexed” vector[] in the vectorized sampling statement. I’m generating this code from Haskell, and while it’s easy enough to do the workaround, there’s some book-keeping involved in using a different expression when sampling vs. in post-stratification. Until I hit this, that book-keeping was simpler and all around what gets put into the index arguments. But this requires a more complex bit of decision-making.

Topic		Replies	Views
Multiple indexing from a multidimensional array to a vector Modeling	4	1039	April 13, 2019
Using integer array to index into a vector General	2	3102	July 24, 2018
Vectorized Matrix/Vector Multiplication with Factors General matrix	4	840	February 2, 2021
Issues with indexing and performing mathematical operations on three-dimensional array Modeling rstan , specification	3	683	January 9, 2022
Increasing Stan efficiency by vectorizing for loop Modeling	6	662	October 9, 2022

Syntax for vectorizing with two indices?

Related topics