I would like to forecast R_t in an epidemic (the expected number of new infectious produced per single infection at time t) and have the following (simplified) model that I would like to fit in Stan:

where I_t is the true incidence at time t and w_s is a weighting variable that accounts for the infectious period etc. of the disease which is strongly determined by prior epidemiological knowledge. We are dealing with daily data and so, for the diseases we are considering, s is typically around 50 (days).

Additionally, we don’t actually observe the true incidence at any given time; we see an observed incidence assuming a fixed observation process:

I^{obs}_t\sim \text{binomial}(I_t, p)

where p is the probability that we detect a given case which is heavily informed by another analysis.

I haven’t a clue how to marginalise out all the I_t parameters and whether this is possible in Stan.
Currently, we are using a Gibbs sampler to do the MCMC instead but wanted to ask if anyone has ideas as to how to estimate this model in Stan?

I can’t see an easy way to marginalise them out. How important is it that values of I_{t} are actually discrete? If the I_{t} values are all sufficiently large then you might be able to pretend that it is continuous, or at least evaluate the binomial density function using its continuous analog.

A few things I’m not sure about. Is the I_{t-s} on the RHS of your equation the true or observed incidence? It is the true quantity in my original equation and just wanted to check since the I doesn’t have an “obs” superscript.

Secondly, my fault for over-simplicity, p is actually a function of time. That is p=p_t. I haven’t done the marginalisation calculation myself yet and so wanted to check it works in this case.

You just need I_t as a transformed parameter. Using I_t^{observation}/p^t is a common if not great choice. If$I$ is not huge you can explicitly marginalized it. There’s a variety of solutions but they depend on the context.

Sorry, I’m being slow. Is what you are saying an approximation or the full posterior? In other words are you using the mean of unknown incidence in place of it: \mathrm{E}(I_t) = I_t^{obs} / p_t.

I_t is typically around 10-20. I can’t see how I could do nested sums over the \approx 50 I_{t-s} terms in the Poisson bracket… I suspect I’m not understanding something here…?