I would like to model the rate of a particular event across different populations with a Poisson model, using the log population size as an offset. My plan was to do this in brms. However, the numerator (the count of events) is an approximation for many of my observations, and the approximation gives non-integer values.
When I run brms with family="poisson", I get the following error message:
Error: Family 'poisson' requires integer responses.
I know that the support of the Poisson distribution is the non-negative integers, but I always thought that the likelihood was still defined for non-integer values, so that you could still fit the model. I think this is possible in glm, but apparently not in brms.
What is the closest alternative model in brms that can allow for non-integer response values? One thing to note is that the outcome also includes a large number of zeroes; I was considering fitting a zero-inflated model or a negative binomial to account for those, depending on how the Poisson model fit. However, the negative binomial also does not allow for non-integer responses.
I suppose that one option could be to round the response, but I’d rather avoid that.
Technically the likelihood is not defined for non-integers, since the factorial is only defined for integers. I guess you could replace the factorial with the gamma function, using \Gamma(n+1) = n!, to generalize the support to the non-negative real numbers, but for the result to be a proper distribution it would also have to be normalized. You could then use it as the likelihood like any other custom or empirical distribution, at least in Stan; I’m not sure how custom distributions can be specified in brms.
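To make the gamma-function idea concrete, here is a minimal Python sketch (not brms or Stan code) of the Poisson log density with the factorial swapped for \Gamma(y+1), plus a log-population offset as in the original question. The function names and the toy numbers are made up for illustration, and the density is deliberately left unnormalized, so treat this as a sketch of the idea rather than a proper likelihood.

```python
import math


def poisson_like_logdensity(y, mu):
    """Poisson log pmf with y! replaced by Gamma(y + 1).

    Matches the ordinary Poisson log pmf at integer y. For non-integer y
    it is only an UNNORMALIZED log density (it does not integrate to 1).
    """
    if y < 0 or mu <= 0:
        raise ValueError("y must be >= 0 and mu must be > 0")
    return y * math.log(mu) - mu - math.lgamma(y + 1.0)


def rate_model_loglik(counts, populations, log_rate):
    """Sum the log densities, using log(population) as an offset.

    log_rate is a single (hypothetical) parameter: the log event rate
    per unit of population, shared by all observations.
    """
    total = 0.0
    for y, pop in zip(counts, populations):
        mu = math.exp(log_rate + math.log(pop))  # mean = rate * population
        total += poisson_like_logdensity(y, mu)
    return total


# Toy data (invented): approximate, non-integer event counts.
counts = [0.0, 1.7, 4.2]
populations = [1000, 2500, 4000]
print(rate_model_loglik(counts, populations, math.log(0.001)))
```

At integer values of y this reproduces the usual Poisson log pmf, which is why existing software can often evaluate it without complaint; the missing normalizing constant depends on mu, which is the part that makes it not a proper distribution.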
I never had a strong reason to use a custom distribution, since often there isn’t a justification for it in the data generating process, or it would make the model too complicated, and it may be easier to change some other part of the mathematical model. I see two other alternatives, then: rounding the observations to integers, or using some other distribution.
If you have a strong reason to choose a Poisson likelihood (i.e. the generating process strongly suggests it), rounding would probably be the only real alternative, since the approximation resulting in non-integers would really mess up anything else. I understand, though, that rounding may distort the observations in ways that aren’t desirable.
If a generating process-based justification is out of the picture, maybe you could pick any distribution that generally resembles the Poisson. One option is a normal distribution with the constraint that \sigma^2 = \mu, to match the Poisson's equal mean and variance (although this is a little weird and won’t work well close to zero, where the Poisson starts to look very different). Another is a Gamma distribution, which you can combine with a point mass at zero to get a kind of zero-inflated Gamma, for instance.
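As a rough illustration of the Gamma-plus-point-mass idea, here is a small Python sketch of the log-likelihood for one observation. The names (gamma_logpdf, zero_gamma_logpdf, pi0, shape) and the mean/shape parametrization are my own assumptions for the example. As far as I know, brms lists a hurdle_gamma family that seems to correspond to this construction, but check its documentation for the exact parametrization.

```python
import math


def gamma_logpdf(y, mu, shape):
    """Log density of a Gamma with mean mu and shape parameter `shape`.

    The rate is shape / mu, so the variance is mu**2 / shape.
    """
    if y <= 0 or mu <= 0 or shape <= 0:
        raise ValueError("y, mu and shape must all be > 0")
    rate = shape / mu
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(y) - rate * y)


def zero_gamma_logpdf(y, mu, shape, pi0):
    """Point mass at zero (probability pi0) mixed with a Gamma for y > 0."""
    if not 0.0 < pi0 < 1.0:
        raise ValueError("pi0 must be strictly between 0 and 1")
    if y < 0:
        raise ValueError("y must be >= 0")
    if y == 0:
        return math.log(pi0)
    return math.log1p(-pi0) + gamma_logpdf(y, mu, shape)


# Toy values (invented): one exact zero and one non-integer "count".
print(zero_gamma_logpdf(0.0, mu=2.0, shape=1.5, pi0=0.3))
print(zero_gamma_logpdf(1.7, mu=2.0, shape=1.5, pi0=0.3))
```

Because the Gamma puts no mass at exactly zero, every zero in the data is attributed to the point mass. That differs from a zero-inflated Poisson, where zeroes can come from either component.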
These are the alternatives I can think of off the top of my head. In general, the choice depends on what your priorities are and how acceptable the caveats of each option are.
If you are willing to use a custom likelihood, you can easily code up the Poisson or negative binomial likelihood using the gamma function, and non-integer values then work without complaint. I have struggled with this problem for some time from a theoretical perspective. My solution now is to move to a Gamma distribution, with zero-inflation if needed.
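For the negative binomial, the same trick amounts to writing the binomial coefficient with gamma functions. Below is a hedged Python sketch; the mean/dispersion parametrization (mu, phi) is my assumption, chosen to match what Stan calls neg_binomial_2, and the function name is made up. As with the Poisson version, the result is an unnormalized density for non-integer y.

```python
import math


def negbin_like_logdensity(y, mu, phi):
    """Negative binomial log pmf (mean mu, dispersion phi) written with
    gamma functions so that non-integer y can be evaluated.

    Equals the usual log pmf at integer y; for non-integer y it is an
    UNNORMALIZED log density.
    """
    if y < 0 or mu <= 0 or phi <= 0:
        raise ValueError("y must be >= 0; mu and phi must be > 0")
    log_coef = math.lgamma(y + phi) - math.lgamma(y + 1.0) - math.lgamma(phi)
    return (log_coef
            + phi * math.log(phi / (mu + phi))
            + y * math.log(mu / (mu + phi)))


# Integer check: exp() of this is the ordinary NB pmf at y = 1.
print(math.exp(negbin_like_logdensity(1, mu=3.0, phi=2.0)))
# A non-integer "count" is evaluated without error.
print(negbin_like_logdensity(1.7, mu=3.0, phi=2.0))
```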