Bayesian estimation for association models with multiplicative predictors

I would like to know if it is possible to fit the Lee-Carter model in a Bayesian setting. This model is used to forecast population mortality dynamics and has the following form:
\log(\mu_{xt}) = \alpha_x + \beta_x \kappa_t ,
where \beta_x and \kappa_t are factors for age and year, respectively. Because the linear predictor contains the product of two unknown parameter vectors, this model does not fit into the GLM framework and cannot be fitted with the stan_glm function of the rstanarm package. In the frequentist setting it can be fitted with the gnm package, but I couldn't find a Bayesian equivalent.
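For context on the constraints that come up below (the simplex on \beta_x and the sum-to-zero question for \kappa_t): the multiplicative term is only identified up to rescaling and shifting. For any c \neq 0 and any d,

\alpha_x' = \alpha_x + d \beta_x , \qquad \beta_x' = \beta_x / c , \qquad \kappa_t' = c (\kappa_t - d)

give exactly the same fit, since \alpha_x' + \beta_x' \kappa_t' = \alpha_x + \beta_x \kappa_t . The classical Lee-Carter fit resolves this by imposing \sum_x \beta_x = 1 and \sum_t \kappa_t = 0.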

Do you know if this can easily be implemented in Stan?

Thank you very much for your help!

Here is the start of some code to do a Lee-Carter model:

data {
  int<lower = 1> J;                    // number of age categories
  int<lower = 1> T;                    // number of years
  matrix[J, T] m;                      // mortality matrix
  int<lower = 2, upper = T> anomaly;   // year of the pandemic (at least 2, since k[anomaly - 1] is used)
}
transformed data {
  vector[J * T] y = to_vector(log(m));
}
parameters {
  vector[J] a;
  simplex[J] b;                        // strictly positive and sums to 1
  
  real alpha;                          // drift in random walk
  real beta;                           // impact of the pandemic
  vector[T] epsilon;                   // unscaled errors in the random walk
  
  real<lower = 0> sigma[2];            // error standard deviations
}
transformed parameters {
  vector[T] k;                         // period index, one element per year
  k[1] = alpha + sigma[1] * epsilon[1];   // scale the first innovation like the rest
  for (t in 2:(anomaly - 1)) k[t] = k[t - 1] + alpha + sigma[1] * epsilon[t];
  k[anomaly] = k[anomaly - 1] + alpha + beta + sigma[1] * epsilon[anomaly];
  for (t in (anomaly + 1):T) k[t] = k[t - 1] + alpha + sigma[1] * epsilon[t];
}
model {
  vector[J * T] mu;
  int pos = 1;
  for (t in 1:T) for (x in 1:J) {      // column-major order, matching to_vector(log(m))
    mu[pos] = a[x] + b[x] * k[t];
    pos += 1;
  }
  target += normal_lpdf(y | mu, sigma[2]);
  target += std_normal_lpdf(epsilon);
  target += exponential_lpdf(sigma | 1);
}

This does not impose the restriction that k sums to zero, nor does it have a generated quantities block with forecasts yet.
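For the forecasts, here is a minimal sketch of what a generated quantities block could look like (the horizon H and its value are my additions; in practice H would be passed in through the data block). It simulates the random walk forward, assuming no further anomaly:

generated quantities {
  int H = 10;                          // forecast horizon (illustrative; could be data)
  vector[H] k_new;                     // forecast of the period index
  matrix[J, H] m_new;                  // posterior predictive mortality rates
  k_new[1] = normal_rng(k[T] + alpha, sigma[1]);
  for (h in 2:H)
    k_new[h] = normal_rng(k_new[h - 1] + alpha, sigma[1]);
  for (h in 1:H)
    for (x in 1:J)
      m_new[x, h] = exp(normal_rng(a[x] + b[x] * k_new[h], sigma[2]));
}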

EDITS:

  • Corrected the sizes of b and epsilon

Thank you very much for the start of the code! I have three small comments regarding the code:

  • I think we should have simplex[J] b and vector[T] epsilon, since the first refers to ages and the second to years.
  • In the definition of k, shouldn’t we have sigma[2] instead of sigma[1], since there is no sigma[1] defined?
  • In the last line, shouldn’t we have target += exponential_lpdf(sigma[2] | 1)?

You’re correct. I just now edited the code to reflect that.

The real<lower = 0> sigma[2] declaration in the parameters block is an array of two positive numbers. You can think of it as \sigma_1 and \sigma_2: \sigma_1 is the scale of the innovations in the random walk for k, and \sigma_2 is the scale of the errors in m.

That just puts an exponential prior with a rate of 1 on both \sigma_1 and \sigma_2, but that is definitely something that could be changed.
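For instance (just an illustration, not something from this thread), swapping in a half-normal prior would only change that one line:

  target += normal_lpdf(sigma | 0, 1);   // half-normal, since sigma is declared with <lower = 0>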


Thanks for the reply! The code works with the small change vector[T] k in the transformed parameters block, since k refers to years.

Concerning the sum-to-zero constraint, what is the best way to impose that the sum of k is zero? If we set the last element of the series to minus the sum of the previous elements, I am not sure this is correct (it would mean one less random innovation in the sequence).

Thanks in advance!

I would say to not impose the sum-to-zero constraint for now. It is not strictly necessary in a Bayesian context.
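If identified quantities are wanted later, one option (a sketch of mine, not from the thread) is to leave the sampler unconstrained and report centered versions in generated quantities. The likelihood is unchanged, because a[x] + b[x] * k[t] equals a_centered[x] + b[x] * k_centered[t]:

generated quantities {
  vector[T] k_centered = k - mean(k);       // sums to zero by construction
  vector[J] a_centered = a + b * mean(k);   // absorbs the shift into the age effects
}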