Dirichlet-Gamma-Poisson mixture distribution, or something like that?

I fear vagueness could cost us a lot of wasted effort here. What do you mean by “evaluate \hat K”? Because being able to draw from the distribution of \hat K given all the other parameters is IMHO much easier than evaluating the log-mass of an observed \hat K value given the other parameters. Do you really need both? In other words: do you need to use \hat K as a source of information? Or is it simply a derived characteristic you would like to compute after extracting information from the K_i? Or do you just want to do cross-validation against a held-out part of the dataset? Or is there something else you need?

I don’t think I follow what you are trying to accomplish here… My reasoning was really very simple (but maybe I didn’t explain it very well): since we end up building the negative binomial by matching the mean and variance of the sum, we might as well just work with the mean and covariance all the way.

So assuming \phi = 0 for simplicity, we get

\mathrm{E}(\hat K) = \mathrm{E}(\sum_i \nu_i) = \mathrm{E}(\sum_i \mu x_i) = \mu \\
\mathrm{Var}(\hat K) = \sum_i \mathrm{Var}(K_i) + 2 \sum_{1 \le i < j \le Q} \mathrm{Cov}(K_i, K_j) \\
= \sum_i \mu x_i + \sum_i \frac{\mu^2 x_i^2}{\alpha} + 2 \sum_{1 \le i < j \le Q} \mathrm{Cov}(K_i, K_j)

The covariance terms are tricky and I don’t see any immediate way to compute them exactly, but if Q is large, they will be small-ish. We could also probably approximate \mathrm{Cov}(K_i, K_j) \approx \mathrm{Cov}(\nu_i, \nu_j) = \mathrm{Cov}(\mu x_i, \mu x_j) = \mu^2 \mathrm{Cov}(x_i, x_j) = \mu^2 \frac{-a^2}{(Qa)^2(Qa + 1)} = -\frac{\mu^2}{Q^2(Qa + 1)}, which is the usual covariance of a symmetric Dirichlet with all Q concentration parameters equal to a. I hope the generalization to other values of \phi doesn’t present any additional big challenges.
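If you want to sanity-check the Dirichlet covariance term before relying on it, something like the rough sketch below should do. It assumes the x_i come from a symmetric Dirichlet with concentration a, and all the function names (`dirichlet_cov`, `sample_dirichlet`, `mc_cov`) are just made up for illustration. It only checks \mathrm{Cov}(x_i, x_j), not the \mathrm{Cov}(K_i, K_j) \approx \mathrm{Cov}(\nu_i, \nu_j) approximation itself.

```python
import random


def dirichlet_cov(Q, a):
    """Closed-form Cov(x_i, x_j), i != j, for a symmetric Dirichlet(a, ..., a) with Q components."""
    a0 = Q * a  # sum of all concentration parameters
    return -(a * a) / (a0 * a0 * (a0 + 1.0))


def sample_dirichlet(Q, a, rng):
    """Draw one point on the simplex by normalizing independent Gamma(a, 1) draws."""
    g = [rng.gammavariate(a, 1.0) for _ in range(Q)]
    total = sum(g)
    return [v / total for v in g]


def mc_cov(Q, a, n_draws=100_000, seed=1):
    """Monte Carlo estimate of Cov(x_1, x_2) under the same symmetric Dirichlet."""
    rng = random.Random(seed)
    s1 = s2 = s12 = 0.0
    for _ in range(n_draws):
        x = sample_dirichlet(Q, a, rng)
        s1 += x[0]
        s2 += x[1]
        s12 += x[0] * x[1]
    return s12 / n_draws - (s1 / n_draws) * (s2 / n_draws)


if __name__ == "__main__":
    Q, a = 5, 2.0  # arbitrary illustrative values
    print("closed form:", dirichlet_cov(Q, a))
    print("simulated:  ", mc_cov(Q, a))
```

Multiplying the closed-form value by \mu^2 gives the approximate covariance term from the formula above.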

Assuming \hat K is roughly NB, the mean and variance are then enough to fully describe the distribution. Obviously a simulation study should be run to check the approximation is OK-ish.
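To make the moment-matching step concrete, here is a minimal sketch. I am assuming the mean/dispersion parameterization of the negative binomial, where the variance is mean + mean^2 / dispersion; the names `nb_from_moments`, `nb_logpmf` and `disp` are mine and not from any library. It takes whatever mean and variance you computed for \hat K and gives back the matched NB log-mass.

```python
import math


def nb_from_moments(mean, var):
    """Match a negative binomial (variance = mean + mean^2 / disp) to a given mean and variance."""
    if var <= mean:
        # The NB is always overdispersed, so a variance <= mean cannot be matched.
        raise ValueError("negative binomial requires var > mean")
    disp = mean * mean / (var - mean)
    return mean, disp


def nb_logpmf(k, mean, disp):
    """Log-mass of count k under the mean/dispersion negative binomial."""
    return (
        math.lgamma(k + disp) - math.lgamma(disp) - math.lgamma(k + 1)
        + disp * math.log(disp / (disp + mean))
        + k * math.log(mean / (disp + mean))
    )


if __name__ == "__main__":
    # Illustrative numbers only: pretend we computed E(K_hat) = 10 and Var(K_hat) = 30.
    mean, disp = nb_from_moments(10.0, 30.0)
    print("dispersion:", disp)
    print("log-mass at K_hat = 12:", nb_logpmf(12, mean, disp))
```

For the simulation study, you would then compare this mass function against a histogram of simulated \hat K values.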

But I note once again that this is only needed if we need to evaluate the mass of a specific, observed \hat K given the parameters.

I honestly also didn’t completely understand the logic of this one, but I think I can follow the rough directions. (Note: I am not so good at math, despite possibly looking confident). I think trying to find the moments is sensible. Anyway, I can recommend Wolfram Alpha (or Wolfram Cloud for more advanced uses) for solving integrals. Even the free versions can definitely solve integrals I am completely unable to crack :-). I can however see how you could (if Q is large-ish) assume that Q is roughly gamma distributed and treat it as an additional parameter in the model…

But the point of assuming the exponential distribution was that you can simulate \hat K under this assumption very easily and you can easily do cross-validation against a held-out set of K_i values. However, I don’t think it lets you compute the mass for an observed \hat K in any easy way.

EDIT: I’ve been mixing density and mass in the post, so tried to fix that.

Hope I am making sense - maybe my proposals don’t work and I am overlooking some basic problem, so please treat my suggestions with some skepticism.