Marginalize over denominators in mixture of binomials

Thanks @maxbiostat for tagging me. First, note the thread "Sum of binomials when only the sum is observed", where I discussed a similar topic. It turns out that how you choose N_1 matters, and I think the process you describe is actually identical to this one:

For each element:

  1. Flip a biased coin (with prob \tau) independently
  2. If the coin was heads, choose \hat{\theta} = \theta_1 else choose \hat{\theta} = \theta_2
  3. Flip a biased coin with prob \hat{\theta}; if it is heads, add one to X

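If that is so, then each element independently adds one to X with probability \tau \theta_1 + (1-\tau) \theta_2, so X \sim \text{Binomial}(N, \tau \theta_1 + (1-\tau) \theta_2). The data then only inform that combined probability, and you cannot infer any information about \theta_1, \theta_2, \tau individually.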
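
To make the non-identifiability concrete, here is a minimal Python sketch (not Stan, and not taken from the model in question). It assumes my reading of the process is right, i.e. that N_1 \sim \text{Binomial}(N, \tau). The function names `binom_pmf` and `process_pmf` and the parameter values are made up for illustration. It computes the exact distribution of X by summing over N_1 and over both binomial counts, then compares two different parameter settings that share the same \tau \theta_1 + (1-\tau) \theta_2 = 0.4.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial(n, p) probability mass at k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def process_pmf(n, tau, theta1, theta2):
    """Exact pmf of X under the three-step process, marginalizing over N_1.

    Assumption: N_1 ~ Binomial(n, tau) elements use theta1, the remaining
    n - N_1 elements use theta2, and X is the sum of the two binomial counts.
    """
    pmf = [0.0] * (n + 1)
    for n1 in range(n + 1):
        w = binom_pmf(n1, n, tau)              # P(N_1 = n1)
        for x1 in range(n1 + 1):
            p1 = binom_pmf(x1, n1, theta1)     # successes among theta1 elements
            for x2 in range(n - n1 + 1):
                pmf[x1 + x2] += w * p1 * binom_pmf(x2, n - n1, theta2)
    return pmf

n = 12
# Two different (tau, theta1, theta2) settings with the same combined probability 0.4
a = process_pmf(n, tau=0.5, theta1=0.2, theta2=0.6)
b = process_pmf(n, tau=0.25, theta1=0.7, theta2=0.3)
direct = [binom_pmf(k, n, 0.4) for k in range(n + 1)]

max_diff_ab = max(abs(x - y) for x, y in zip(a, b))
max_diff_direct = max(abs(x - y) for x, y in zip(a, direct))
print(max_diff_ab, max_diff_direct)
```

Both differences should be zero up to floating-point error, so the two settings produce the same distribution of X and no amount of data on X alone can tell them apart.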

If the process is actually different, the saddlepoint approximation mentioned in the paper linked by @maxbiostat might be sensible.

I wrote about implementing a saddlepoint approximation for a sum of negative binomials here: https://www.martinmodrak.cz/2019/06/20/approximate-densities-for-sums-of-variables-negative-binomials-and-saddlepoint/ The post discusses all the nuts and bolts needed to get it running in Stan.

For negative binomials, the saddlepoint approximation was not very useful, as there are simpler approximations that still work well. Binomials are different, however (see the thread I linked earlier). Be aware that the saddlepoint approximation is very slow to compute.

Best of luck with your model!
