Inconsistency in zero-inflated Poisson models?

Please also provide the following information in addition to your question:

  • Operating System: Windows 10
  • brms Version: 2.9.0

Hi all,

I’m pretty new to Bayesian stuff and this is my first time ever posting a question here, so I apologize in advance if the answer is obvious or I do anything dumb.

I’m trying to get comfortable with zero-inflated Poisson models in brms, and I’m working through the example in Statistical_rethinking_recoded. What’s confusing me is why the following two models aren’t yielding exactly the same results:

prob_drink <- 0.2 # 20% of days drinking -> 0 manuscripts
rate_work <- 1 # average 1 manuscript per day (lambda)

#Sample one year of production
n <- 365
drink <- rbinom(n, 1, prob_drink)
y <- (1 - drink) * rpois(n, rate_work) #number of manuscripts completed

d <-
 tibble(Y = y) %>%
 arrange(Y) %>% 
 mutate(zeros = c(rep("zeros_drink", times = sum(drink)),
               rep("zeros_work",  times = sum(y == 0 & drink == 0)),
               rep("nope",        times = n - sum(y == 0))

b11.4a <-
  brm(data = d, family = zero_inflated_poisson,
  Y ~ 1,
  prior = c(prior(normal(0, 10), class = Intercept),
  prior(beta(2, 2), class = zi)), # the brms default is beta(1, 1)
  cores = 4,
  seed = 11,
  inits = "random")

b11.4_b <-
  brm(data = d, family = zero_inflated_poisson(link_zi = "identity"),
  bf(Y ~ 1, zi ~ 1),
  prior = c(prior(normal(0, 10), class = Intercept),
  prior(beta(2, 2), class = Intercept, dpar = zi)), # the brms default is beta(1, 1)
  iter = 2000, warmup = 1000, chains = 4,
  cores = 4,
  seed = 11)

The results are very similar, but not identical (particularly the n_eff). Whereas with the seed, if I run b11.4a over and over again the results are exactly the same. I compared the stan code produced by both these models and do see some differences, but I’m not skilled enough to know exactly what I’m looking at.

Regardless of why they’re different, would model b11.4_b be a good basis for future zero-inflated Poisson models?

Thank you!

What is missing in b11.4_b is that you tell Stan that zi is between 0 and 1. Yes, you specify a prior which only has mass between 0 and 1, but the sampler may still try to go outside and when it does produce divergent transitions. Thus, if you really want to predict zi on the “identity” scale you need to make sure that the complex predictor of it is always in [0, 1], which is very difficult without actually using a proper link function such as logit.

Oh, interesting! Thanks!

So in that case (sticking with the logit link for zi) a beta prior would be inappropriate and I should choose another one, such as the default logsitic in brms, right?

The beta prior is appropriate, but the model still things that your intercept can be anywhere between -Inf and Inf, although of course this is not the case due to the beta prior. You can see this when extracting the Stan code via stancode(b11.4a). Specficially, temp_Intercept has no upper or lower boundary and brms does not support setting boundaries on this parameter, because in the brms framework their is bascially no need for it.

If you really want to do it for some reason, you may use

formula = Y ~ 0 + Intercept
prior = prior(beta(2, 2), class = b, dpar = zi)
1 Like