Using map_rect for operations on each column of a matrix

Dear all,

I am quite new in Stan. I want to apply a function (that calculates one element of a vector using the value of the previous one) on each column of a matrix. Instead of doing the calculation one column after another, I am trying to implement map_rect where the shards are the columns of the matrix. Bellow is my Stan code with and without using map_rect.

Without map_rect :

functions {
  vector function1( real theta, vector x )
  {
    vector[rows(x)] x_transformed;
    x_transformed[1] = x[1];
    for ( i in 2:rows(x) ) {
      x_transformed[i] = x[i] + theta * x_transformed[i-1];
    }
    return x_transformed;
  }
  matrix function1Mat( vector theta, matrix x )
  {
    int N = rows(x);
    int K = cols(x);
    matrix[N, K] out;
    for ( i in 1:K ) {
      out[ 1:N, i ] = function1( theta[i], x[1:N, i] );
    }
    return out;
  }
}
data {
  int<lower = 0> N;
  int<lower = 0> K;
  matrix[N, K] X;
  vector[N] Y;
}
transformed data {
}
parameters {
  vector[K] beta;
  real<lower=0> sigma;
  vector<lower=0, upper=1>[K] theta;
  real<lower=0> alpha;
}
model {
  beta ~ normal(1, 1);
  theta ~ normal(0.5, 0.5);
  sigma ~ normal(0, 200);
  alpha ~ normal(100, 50);
  Y ~ normal(alpha + function1Mat(theta, X) * beta, sigma);
}

With map_rect :

functions {
  vector function1(vector phi, vector theta, real[] x_r, int[] x_i) {
    vector[size(x_r)] x_transformed;
    x_transformed[1] = x_r[1];
    for ( i in 2:size(x_r) ) {
      x_transformed[i] = x_r[i] + theta[1] * x_transformed[i-1];
    }
    return x_transformed;
  }
}
data {
  int<lower = 0> N;  // rows
  int<lower = 0> K;  // cols
  matrix[N, K] X;
  vector[N] Y;
}
transformed data {
  real x_rs[K, N] = to_array_2d(X');  // the shards are the columns of the input matrix
  int x_is[K, N];  // x_is is not used
  for (k in 1:K) {
    x_is[k, 1] = N;
    x_is[k, 2] = K;
  } 
}
parameters {
  vector[K] beta;
  real<lower=0> sigma;
  vector<lower=0, upper=1>[1] theta[K];
  real<lower=0> alpha;
  vector[0] phi;  // phi is not used
}
model {
  alpha ~ normal(100, 50);
  beta ~ normal(1, 1);
  for (i in 1:K) {
    theta[i] ~ normal(0.5, 0.5);
    }
  sigma ~ normal(0, 200);
  Y ~ normal( alpha + to_matrix(map_rect(function1, phi, theta, x_rs, x_is), N, K) * beta, sigma);
}

However, the calculation without map_rect is way faster than with map_rect…
Without map_rect : “Chain 1: Gradient evaluation took 0.000199 seconds”
With map_rect : “Chain 1: Gradient evaluation took 0.024525 seconds”

Is there a way to modify the code so that map_rect is effective to speed up the calculation in that case?

Thank you for your help.

Using parallelization comes with an overhead. Your columns would need to be really large or/and your by-column operation to be super costly. This does not seem to be the case given that your gradient runs not too slow.

Instead of going parallel maybe have a look at the normal_id_glm function for the likelihood… and if the data is largish then I would actually use reduce_sum for this to paralellize the likelihood.

… but wait… almost forgot: You should first profile with the latest profile feature in Stan 2.26 what makes your program run slow. Then optimize.

1 Like