Predicting with a negative binomial likelihood

I am attempting to fit a relatively simple negative binomial model to estimate deaths by month, and predict out into the future for excess mortality estimates. But, I am struggling with the specifying the code for the prediction portion. The code is further complicated by uncertainty in the log offset term; however, the model ran successfully with its current specification before I added the prediction.

In statistical notation, the model can be written as:

y_{st[m]} \sim \mbox{NegBinom}(\eta_{st[m]}, \phi)
\eta_{st[m]} = \log(N_{st[m]}) + \alpha_s + \gamma_{st} + \theta_{sm} + \delta_{st[m]}

where s represents sex, t represents year, and m represents month. The outcome y is monthly deaths by sex, \eta is the linear predictor, N is the monthly population size accounting for migration, \alpha are sex-specific intercepts, \gamma are sex-specific RW2s on year that share a precision term, \theta are sex-specific cyclical RW2s on month that share a precision term, and \delta are sex-specific year-month IID terms to be estimated on observed data but not included in prediction. Note, prediction involves the estimation of new RW2 terms for the yearly RW2.

I have tried doing prediction in both the model block and the generated quantities block, but am running into issues in both cases.

I have attached both .stan scripts and my data, but I will summarize the two approaches here. To include it in the model block, I specified a bunch of additional parameters and transformed parameters including

parameters{
  ...
  vector[N_pred] deaths_pred;
}

transformed parameters{
  ...
  vector[N_pred] eta_pred;
}

model{
  ...
  // Likelihood
  for(i in 1:N){
    deaths[i] ~ neg_binomial_2_log(eta[i], phi);
  }
  
  // Prediction
  for(i in 1:N_pred){
    deaths_pred[i] ~ neg_binomial_2_log(eta_pred[i], phi);
  }
}

This approach gives me the error that neg_binomial_2_log expects an integer on the RHS, but then I run into the issue that parameters cannot be integers.

When I tried to switch approaches and use the generated quantities block, the most relevant-to-the-conversation changes were:

transformed parameters{
  ...
  vector[N_pred] eta_pred;
}
generated quantities{
  // Prediction
    matrix[n_draws,N_pred] deaths_pred;
  // simulate from priors
    for(k in 1:n_draws){
      for(j in 1:N_pred){
        deaths_pred[k, j] ~ neg_binomial_2_log_rng(eta_pred[j], phi);
      }
    }
}

This approach produces the error that “Target can only be accessed in the model block or in definitions of functions with the suffix _lp.”

Can anyone point me in the right direction for coding up this prediction portion of my model?

Thank you!
sex_model_genquant.stan (14.8 KB)
sex_model_param.stan (15.0 KB)

1 Like

This needs to be done in generated quanties, but should be done with an assignment statement (=) rather than a distribution statement (~).

1 Like