I am trying to reproduce a stochastic epidemic model from Nature Communications paper (code: GitHub - roidtman/NatComm_dengue_China ), and I would appreciate some advice on whether my current Stan implementation is statistically and computationally reasonable.
Model structure in the original paper
In the paper, the authors specify the model in two stages:
- Latent stochastic transmission process
At each time ttt, a latent number of new infections λt\lambda_tλt is generated as

where $\beta_t$ is a time-varying transmission rate, and \phi is an overdispersion parameter.
- Observation model
The reported cases are then modeled using a beta–binomial likelihood:
![]()
where the beta parameters are functions of the latent infection count \lambda_t.
Thus, the negative binomial component represents process noise, while the beta–binomial distribution is the actual observation likelihood.
Problem in Stan
As discussed in multiple threads (e.g. ( Discrete parameter in Stan - #8 by ghjk )( Initialization Error in Stochastic Epidemic SEIR Model - #17 by maxbiostat )):
- Stan does not support discrete latent parameters.
- RNG functions cannot be used in the
modelblock.
Therefore, it is not possible to directly represent the latent negative-binomial stochastic process in Stan.
My current approach: continuous approximation
Following suggestions from the Stan forum, I approximate the latent stochastic process using a Gamma–Poisson mixture representation:
In Stan, this is implemented as:
parameters {
vector<lower=0>[N] para;
}
model {
for (t in (lag_max + 1):N) {
para[t] ~ gamma(infectious_pressure[t], infectious_pressure[t] / mu[t]);
real alpha = 1 + para[t];
real beta_param = 1 + pop - para[t];
target += beta_binomial_lpmf(cases[t] | pop, alpha, beta_param);
}
}
My question
Conceptually, does this formulation make sense as a continuous approximation to a discrete stochastic epidemic process?
In particular:
- Is it reasonable to treat the latent negative-binomial process as a Gamma-distributed continuous variable when the observation model is beta–binomial?
- Would marginalizing out the latent process (if possible) be preferable to this hierarchical approximation?
Any feedback on whether this modeling strategy is principled—or suggestions for better alternatives within Stan’s constraints—would be greatly appreciated.
Thank you very much for your time.