Hierarchical Piecewise Linear Model (Extending Facebook Prophet forecasting tool)

I’m new to Stan and I am trying to fit a hierarchical piecewise linear model. The code is based on Facebook’s Prophet model, but I’m trying to extend it to be able to use information from different time series. Unfortunately, the sampling process is way too slow and I’m trying to speed it up. I believe one of the root causes is that in transformed parameters I’m using a for loop as I don’t know how to vectorize it:

transformed parameters {
  vector[T] mu;
  for (i in 1:T) {
    mu[i] = linear_trend(k[country_id[i]], m[country_id[i]], delta[country_id[i]], t[i], A[i], t_change);
  }
}

Also, I have read that it’s usually a good idea to reparametrize the model, but I’m not sure if that would help in this case.

The full code is shown below. Any help/reference is greatly appreciated.

functions {
  matrix get_changepoint_matrix(vector t, vector t_change, int T, int S) {
    // Assumes t and t_change are sorted.
    matrix[T, S] A;
    row_vector[S] a_row;
    int cp_idx;

    // Start with an empty matrix.
    A = rep_matrix(0, T, S);
    a_row = rep_row_vector(0, S);
    cp_idx = 1;

    // Fill in each row of A.
    for (i in 1:T) {
      while ((cp_idx <= S) && (t[i] >= t_change[cp_idx])) {
        a_row[cp_idx] = 1;
        cp_idx = cp_idx + 1;
      }
      A[i] = a_row;
    }
    return A;
  }

  // Linear trend function

  real linear_trend(
    real k,
    real m,
    vector delta,
    real t,
    row_vector A,
    vector t_change
  ) {
    return (k + A * delta) .* t + (m + A * (-t_change .* delta));
  }
}

data {
  int T;                // Number of samples
  vector[T] t;          // Time
  vector[T] y;          // Single vector of time series with the data from all the groups (countries)
  int S;                // Number of changepoints
  vector[S] t_change;   // Times of trend changepoints
  real<lower=0> tau;    // Scale on changepoints prior
  int<lower=0> num_country;    // Number of countries
  int<lower=1, upper=num_country> country_id[T]; // Country ID
}

transformed data {
  matrix[T, S] A;
  A = get_changepoint_matrix(t, t_change, T, S);
}

parameters {
  vector[num_country] k;    // Base trend growth rate
  real mu_k;                // Hyperprior on k
  vector[num_country] m;    // Trend offset
  real mu_m;                // Hyperprior on k
  vector[S] delta[num_country]; // Trend rate adjustments
  real<lower=0> sigma_k;    // Observation noise
  real<lower=0> sigma_m;    // Observation noise
  real<lower=0> sigma_obs;  // Observation noise
}

transformed parameters {
  vector[T] mu;
  for (i in 1:T) {
    mu[i] = linear_trend(k[country_id[i]], m[country_id[i]], delta[country_id[i]], t[i], A[i], t_change);
  }
}

model {
  //hyperpriors
  mu_k ~ normal(0, 5);
  sigma_k ~ normal(0, 5);
  
  mu_m ~ normal(0, 5);
  sigma_m ~ normal(0, 5);
  
  //priors
  k ~ normal(mu_k, sigma_k);
  m ~ normal(mu_m, sigma_m);
  for (c in 1:num_country) {
    delta[c] ~ double_exponential(0, tau);
  }
  sigma_obs ~ normal(0, 0.5);

  // Likelihood
  y ~ normal(mu,sigma_obs);
}
1 Like

Hi, sorry that your question slipped through, although it is well written. Did you manage to move forward?

Generally, I would first look on potential model misspecifications than vectorization as those tend to have the biggest impact on speed. Can the model recover parameters from (small-ish) simulated data? Do you get some warnings? Does the pairs plot look good (i.e. “Gaussian-like blobs everywhere”)?

Best of luck with your model!

1 Like