Zero-inflated Bernoulli

We are using the brms package to estimate a zero-inflation parameter. Our data are a survey sample. We are interested in the proportion W of subjects in the sample who have experienced a medical condition that may or may not have symptoms. Unfortunately, the data are binary and contain only Yes/No answers to the question of whether a subject experienced the condition in the past 12 months. Yes answers are coded as Y=1, No answers as Y=0. We believe some of the zeros are subjects with the condition but no symptoms (hence, zero inflation).

We do not have precise information, only a rough estimate, about the likelihood of having no symptoms. We want to use a zero-inflated Bernoulli model to estimate the prevalence W of the condition and how it depends on a set of demographic factors X_1, X_2, X_3.

Letting Z be the zero-inflation parameter, we think of the proportion of zeros in the data as being Z + (1-Z)(1-W).
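To make the expression above concrete, here is a small base-R simulation sketch. The values of W and Z and the sample size are made up for illustration and are not taken from the survey data.

```r
# Simulate a zero-inflated Bernoulli outcome and compare the observed
# share of zeros with Z + (1 - Z) * (1 - W). All values are illustrative.
set.seed(1)
n <- 100000
W <- 0.30   # assumed probability of a "Yes" in the Bernoulli component
Z <- 0.50   # assumed zero-inflation probability

inflated <- rbinom(n, size = 1, prob = Z)   # 1 = forced (inflated) zero
y_bern   <- rbinom(n, size = 1, prob = W)   # draw from the Bernoulli component
Y        <- ifelse(inflated == 1, 0, y_bern)

observed_zero_share <- mean(Y == 0)
expected_zero_share <- Z + (1 - Z) * (1 - W)

print(c(observed = observed_zero_share, expected = expected_zero_share))
```

Note that the share of ones is (1-Z)W, so the binary data alone only pin down that product. Separating Z from W has to come from the prior on Z or from the covariates.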

Our question is about the output of the brms package. In the code, the observed response Y is the variable MEDCOND. Here is the code we are using:

fit_zibin <- brm(MEDCOND | trials(1) ~ X_1 + X_2 + X_3,
                 data = df, family = zero_inflated_binomial(),
                 prior = set_prior("beta(19, 17)", class = "zi"))

Is this correct? That is, is the reported zi the estimated inflation parameter? Do the reported regression coefficients of the explanatory variables X_i measure their impact on the true proportion W of subjects with the condition? (The beta prior reflects data from a few previous studies.)


I'm not sure that a zero-inflated Bernoulli model is identifiable without actually having known asymptomatic observations (e.g., see "Zero Inflated Logistic Regression - Does This Exist?" on Cross Validated).

Thanks. I am only interested in understanding the output of the code. Is zi what I am assuming it is?

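You should see a distributional parameter named zi, which is the zero-inflation probability. Note that zero_inflated_binomial uses the logit link for zi by default, which matters once zi is given its own formula; with a constant zi, as in your code, it is reported on the probability scale. You could model how zi might depend on your predictors too.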
These parts of the brms vignettes give more detail (link 1, link 2) as does ?zero_inflated_binomial.
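As a rough sketch of the two setups mentioned above, the snippet below builds the formula and prior objects for a constant zi and for a zi that depends on a predictor. The choice of X_1 as the zi predictor and the logistic(0, 1) prior are assumptions made for illustration only; the brm() call is left commented out because it needs the actual data frame df and a Stan compile.

```r
library(brms)

# Option A: constant zi, as in the question. zi is reported as a probability,
# so a beta prior on class "zi" applies directly.
form_const  <- bf(MEDCOND | trials(1) ~ X_1 + X_2 + X_3)
prior_const <- set_prior("beta(19, 17)", class = "zi")

# Option B: zi depends on a predictor (X_1 is a hypothetical choice).
# Its coefficients are then on the logit scale, so priors are set per dpar.
form_pred  <- bf(MEDCOND | trials(1) ~ X_1 + X_2 + X_3, zi ~ X_1)
prior_pred <- set_prior("logistic(0, 1)", class = "Intercept", dpar = "zi")

# Fitting (not run here):
# fit <- brm(form_pred, data = df, family = zero_inflated_binomial(),
#            prior = prior_pred)
# summary(fit)
```

In Option B the summary should list zi_Intercept and zi_X_1 instead of a single zi; plogis() converts a value on the logit scale back to a probability. In both options the coefficients of the main formula are on the logit scale of the non-inflated component.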
