Why does the trend derived from ZOIB-model Brms not overlay manual calculated trend?

Hello,

I never placed a question here, so first time questioner. I have performed a Zero-One-Inflated-Beta regression with logit link in R. The conditional effects can be extracted which is useful for plotting given it returns the intervals. Yet, when I calculate backwards from the values returned to the expected value (ZOIB and B [without ZOI]) and compare this with the expected values the Brms function conditional_effects returns, the trend lines do not match. Also, these manual derived values match much better with the data, I’ve been scratching my head for a few days now. And, if I am correct then the expected value should be:

E(y|x)=exp(β+β1x1+Z1+C1)/(1+exp(β0+β1x1+Z1+C1))

The question is why don’t the expected values match, where do I go awry?

Thank you in advance


P.S. Duplicated question from https://www.researchgate.net/post/Why_does_the_trend_derived_from_ZOIB-model_Brms_not_overlay_manual_calculated_trend

P.P.S. Visualizing conditional effects plot for e.g. zi parameter - #2 by paul.buerkner

brack_frac_mod.RData (2.7 MB)
brack_frac.RData (9.7 KB)

1 Like

Could you be more specific about what are you doing exactly here? I am not familiar with the brms output format, but does this mean you are using the expected values of individual parameters to compute the trend you describe? (If yes, this will usually be a a meaningless output, since the mean values for all parameters are most likely not a sample in the chain and may have very low likelihood).

How did you assess what a “better match” is, did you compute the log posterior for the two outputs? While that would be the objective way of comparing the outputs for different sets of parameters, it’s also worth noting that the mean or median expected value don’t need to have higher likelihood than all other samples (though the MAP does).

So in principle there need not be something going awry for you to observe what you describe.

Dear @caesoma,

I am sorry my question is unclear. I assessed the fit through the data by eyeballing the scatterplot. I know there are less “subjective” ways, but I do not consider this of the utmost relevance in this case, considering using a single DV. Additionally, if I use the posterior draws to construct predictions (image hypothetical outcome plots [red]). The expected value and its variability lie more centered through the points. Fitting a quasibinomial shows a “better” fit [blue]. The question, why does conditional_effects()[green] not follow the manual derived values [red]. Should this not follow:
E(Y|X)=exp(β+β1X1+Z1+C1)/(1+exp(β0+β1X1+Z1+C1))

I think there are two issues here (note that I don’t have much experience with ZOIB models, but I think my assumptions are plausible):

First, conditional_effects() with dpar = NULL evaluates to zoi * coi + mu * (1 - zoi) in brms:::posterior_epred_zero_one_inflated_beta():

E(y|x) = \alpha\gamma + \mu (1-\alpha)

This suggests that your formula for y1 is not correct and this explains the discrepancy.

What you manually compute (with y2) is mu. Your y2 and y3 have (nearly) identical values when you use conditional_effects() with dpar = "mu":

y3 <- brms::conditional_effects(brack_frac_mod, dpar = "mu")
y2 <- exp(-5.19+0.66*x)/(1+exp(-5.19+0.66*x))

plot(brack_frac$value, brack_frac$Brackish, xlab="x", ylab="y", pch=19,xlim=c(2,10))
lines(x, y1, col="blue", lwd=2)
lines(y3$value$value, y3$value$estimate__, col="blue", lwd=2)

This means that conditional_effects() with dpar = NULL will not give only the Beta regression part, but will consider also the estimated probability for zero and one values for the response variable. I tihnk the fact why this does not produce a good fit with the data is related to the second issue:

Second, your ZOIB model assumes a constant probability for zero and non-zero values across all values for variable value (the predictor). However, your plot shows that this is not the case (i.e. there are much more values brackish=1 for high values of value and conversely more values brackish=0 for high values of value).

This is why conditional_effects() with dpar = NULL will produce much lower predicted values for brackish (y) for high values of value (x): The model assumes that there is a constant fraction of zero values. The same is true for one values.

Since there are much more zero values in the data than one values, this is visible mainly as an underestimation of brackish for large values of value. However, (as your plot show), the model also overestimates brackish slightly for low values of value.

I think that the confusion gets resolved when you model the zoi and coi parameters appropriately (e.g. with variable value).

2 Likes

As I mentioned I’m not a brms user, so it’s not clear to me what things like conditional_effects() are supposed to do. As also not an R user, I cannot really see what predictors are included. In any case, I believe the fit of the non-inflated (blue) binomial model is besides the point here, and the focus is really on obtaining the same result regardless of the tool to compute it.

@henningte seems to have a better assessment of the actual code and potential issues when coding what your expected.

@henningte Thank you! This makes sense, very well spotted! Now only adjusting for the zoi and coi parameters to improve the fit. Although, you have not much experience with ZOIB models, and this might be a long shot, do you have any sources for ZOIB models at the level of “Ordinal Regression Models in Psychology: A Tutorial” (Bürkner & Vuorre, 2019)?

Although, you have not much experience with ZOIB models, and this might be a long shot, do you have any sources for ZOIB models at the level of “Ordinal Regression Models in Psychology: A Tutorial” (Bürkner & Vuorre, 2019)?

@snwikaij Sorry, I don’t have an overview on what good tutorials would be.

However, I did a quick search and found this preprint (Bendixen et al. 2022: " Cognitive and cultural models in psychological science: A tutorial on modeling free-list data as a dependent variable in Bayesian regression") which shows how to compute ZOIB models with ‘brms’. It’s not my discipline and I have never worked with the tutorial, but on the first glance it looked quite promising. So it’s just a suggestion.

1 Like