I want to fit a model involving a binomial-Poisson hiearchical distribution with two sets of conditionally independent binomial counts:

y_i \sim \text{Poisson}(\lambda_i),

z_{1, i} \sim \text{binomial}(y_i, \kappa),

z_{2, i} \sim \text{binomial}(y_i, \kappa),

where z_{1, i} and z_{2,i} are conditionally independent given y_i, y_i is unobserved, and the \lambda_i come from a regression with a log link (and Gaussian overdispersion).

If I had only a single z_i, it would be easy to marginalize: z_i \sim \text{Poisson}(\kappa \lambda_i).

Approaching marginalizing with two sets of counts in the same way, and with a bit of help from SageMath, I found that the joint marginal density of (z_{1,i}, z_{2, i}) is

P(Z_{1, i} = z_{1, i}, Z_{2, i} = z_{2, i}) = \kappa^{z_{1,i} + z_{2, i}}e^{-\lambda_i}(1 - \kappa)^{-(z_{1, i} + z_{2, i})}a^{z_{1, i}}\binom{z_{1, i}}{z_{2, i}}\frac{1}{z_{1, i}!} {_1F_1}(z_{1, i} + 1, z_{1, i} - z_{2, i} + 1, a),

where a = (1 - \kappa)^2 \lambda_i, I assumed without loss of generality that z_{1, i} \geq z_{2, i}, and _1F_1() is a generalized hypergeometric function.

Can I implement this easily in Stan? There might be some obvious simplification that I havenâ€™t recognized. If not, I havenâ€™t found a generalized hypergeometric function in the documentation, but it doesnâ€™t look too difficult to write a Stan function based on genhypergeo() in the R package hypergeo (although I can imagine there is potential for numerical problems).

Iâ€™d be quite happy to have y_i as a latent variable instead, but I couldnâ€™t immediately see how to do this in Stan: it canâ€™t be a parameter because itâ€™s discrete. I donâ€™t need to be able to identify \kappa and \lambda_i separately, although looking at the formula suggests that this will be possible.