Zero-inflated poisson model for count data

jsocolar · July 30, 2023, 11:54am

If you can show your python simulation code, the answer will become clearer. You can share code by pasting it straight into discourse enclosed in triple backtics, as you’ve done for the R code and output above.

It appears that what’s happening here is that the best-fitting zero-inflated poisson has a relatively low rate of zero inflation (corresponding to roughly the inverse logit of -1.13, or 25%), and then the remainder of the data are not perfectly fit by a Poisson, because they have too few ones compared to the Poisson expectation. Heuristically, the zero-inflated poisson model is doing something like “decide how many zeros to model as zero-inflated such that the remainder of the data can be fit well by some Poisson, and then fit that Poisson” (in reality it doesn’t do these steps sequentially, but rather jointly in one step). What you’re seeing is that there’s no proportion of zeros to remove that leaves the remainder of the data looking perfectly Poisson.

Unfortunately, there’s no general-purpose prior that I can recommend on the Poisson part–my only advise is to remember that the model here is on the log scale, so make sure you are thinking in those terms about the priors (when you exponentiate wide/vague priors, things can get a bit crazy!). Also, be aware that by default, your prior on the intercept in brms will be a prior on the intercept when all predictors are held at their means, NOT when all predictors are set to zero! If you want to put a prior on the intercept when all predictors are set to zero, add 0 + Intercept to your model formulas. If you don’t intend to model covariates on the zi piece, a logistic(0,1) prior can be a good default (this is uniform on the probability scale).

Fitting the models separately and combining later means that any shared covariates (e.g. in the Poisson part) get fitted separately and do not mutually inform across the separate models. So it’s like fitting an interaction of species with all of the effects in the model. That’s not necessarily a bad thing, but it is certainly different than fitting all species jointly. For what it’s worth, it does not feel at all like data manipulation to me to notice that zero-inflation probabilities might vary by species, and if so to model that. In your case, I would consider using a random effect like zi ~ (1 | species). However, it’s hard for me to imagine a scenario where I would expect the zi part to vary strongly by species without also expecting the Poisson part to vary by species. In that case, I would try something like:
count ~ 1 + sday + (1 + sday |g1| species), zi ~ 1 + (1 |g1| species)

Here, the |g1| is an example of the |<ID>| syntax from brms and is being used to model the random effects of species as correlated between the count and the zi formulas.

Hope some of this helps!

Topic		Replies	Views
Prior predictive check for mixed zero-inflated and Poisson models Interfaces brms	2	767	September 14, 2023
Zero_inflated_poisson with brms Modeling rstan , fitting-issues , brms	4	1540	June 14, 2021
Inconsistency in zero-inflated Poisson models? brms	3	1121	June 23, 2019
Poisson model overestimating dispersion and inflating zeroes brms fitting-issues , poisson , phylogenetic	24	3826	August 26, 2020
Help interpreting zero inflated poisson model error brms fitting-issues	4	498	February 17, 2021

Zero-inflated poisson model for count data

Related topics