Trouble with wide CI for ZINB intercepts

I’m new to both Stan and brms, so please bear with me.
I’m struggling to get any certainty about the intercepts (both population-level and group-level) of a Zero-Inflated Negative Binomial model that I plan to use.
The problem shows up with deterministically generated test data, and I cannot figure out why the CIs for the intercept parameters are so wide. I use a weibull(2, 1) prior for the standard deviations within and between groups.

As the data is deterministically generated and perfectly balanced, I expected more certainty in the intercepts.
I interpret the reported intercepts as:

  • “Intercept” is the population-level intercept (average across all groups)
  • r_g1[T, Intercept] is the difference of the g1 group T from the population-level intercept, common to all g2 groups
  • r_g1:g2[T_R, Intercept] is the additional difference for group R, on top of both intercepts above
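
To make this concrete, the model has roughly this structure (just a sketch of the call; the full, runnable version with the actual data and priors is in the attached Rmd):

    library(brms)

    # Rough sketch of the model structure; `d` is a placeholder for the
    # simulated data frame built in the Rmd
    m <- brm(
      y ~ A + C + D + (1 | g1) + (1 | g1:g2),
      data   = d,
      family = zero_inflated_negbinomial(),
      prior  = set_prior("weibull(2, 1)", class = "sd")
    )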

Please see the attached Rmd file for the problem statement, and a reproducible example.
minimal_example_wide_intercepts.Rmd (5.1 KB)

I suspect that because you have random effects with just two levels, the random-effect values are only weakly identified relative to the intercept. That is, given some true set of parameters, it doesn’t change the target density much to move the intercept up by one and both elements of a random-effect vector down by one.

Take a look at

summary(posterior_linpred(m, newdata = data.frame(y = 0, A = 0, C = 0, D = 0, g1 = "T1", g2 = "R1")))

I think you’ll find that the actual value of the linear predictor is well constrained. It’s just that there are many weakly differentiated configurations of the fixed and random intercepts that all lead to similar values of the linear predictor. These problems lessen as the number of random-effect levels grows, because the sample mean of n levels has standard deviation 1/sqrt(n). For example, with two random-effect levels drawn from a standard normal, a sample mean as large as 1 would be roughly a 1.4-sigma event; with 100 levels, it would be a 10-sigma event.
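
One quick way to see this trade-off directly in the draws (a sketch; it assumes the fitted model object is `m` and the default brms draw names such as b_Intercept and r_g1[T1,Intercept]):

    library(posterior)

    draws <- as_draws_df(m)

    # If the intercept and the group offsets trade off against each other,
    # they will be strongly negatively correlated in the posterior
    cor(draws$b_Intercept, draws$`r_g1[T1,Intercept]`)

    # Their sum (the group-specific intercept on the linear-predictor scale)
    # should be much better constrained than either parameter on its own
    sd(draws$b_Intercept)
    sd(draws$b_Intercept + draws$`r_g1[T1,Intercept]`)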

Thanks, @jsocolar - I’m still getting my head around these concepts, and they make more sense to me now with your explanation. I will revisit the example to see the effects in more detail.

Replying to myself, in case someone else finds this thread in the future.
I reran the simulation with 8x8 factor levels (so 64 individual groups, each with 60 observations).

It does improve the situation, but some quirks remain.

These are the R-level intercepts for the T1 group, which according to the generative model should be steadily decreasing (starting from the overall T1 intercept of 0.4, they should be 0.3, 0.2, 0.1, 0, and so on…).
As you can see, the T1_R3 and T1_R7 intercepts do not follow the expected pattern, and there are other quirks as well…
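
For reference, this is one way to pull out those group-level intercepts for inspection (a sketch; it assumes the fitted model is `m` and that the interaction grouping factor is `g1:g2` with levels named like T1_R1):

    # Posterior summaries of the g1:g2 intercept offsets, as an array of
    # levels x statistics
    re <- ranef(m)$`g1:g2`[, , "Intercept"]

    # Keep only the T1_* levels, to compare against the expected
    # steadily decreasing pattern
    t1_rows <- grep("^T1_", rownames(re))
    re[t1_rows, c("Estimate", "Q2.5", "Q97.5")]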

Attached is a new copy of the Rmd (also available on my GitHub: bayesian_travails/minimal_example_wide_intercepts.Rmd at main · epkanol/bayesian_travails · GitHub)

minimal_example_wide_intercepts.Rmd (8.5 KB)

I have not tried increasing the number of data points or using more informative priors at this point, but I am content that adding more levels did tighten the intercept intervals somewhat. This suggests @jsocolar was correct (and I learned a great deal - many thanks!)
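
If I do revisit the priors later, I would probably try something along these lines (just a sketch; the weibull(2, 0.5) scale is a placeholder I have not tested, not a recommendation):

    # Hypothetical tighter prior on the group-level standard deviations;
    # the 0.5 scale is a placeholder, not a tested value
    tighter_sd_prior <- set_prior("weibull(2, 0.5)", class = "sd")

    # This would replace the weibull(2, 1) prior in the brm() call in the Rmd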