Zero-inflated beta regression

Hi there, I am using zero-inflated beta regression to model proportional data, but the LOO-PIT Q-Q plot (from the function ‘ppc_loo_pit_qq’) does not follow the standard uniform distribution; see the figure below. What should I do to improve the model? I wonder if the trouble comes from the prediction of the zero-inflation probability, for example because it does not involve all the predictors (I am not sure).

Another question is how to calculate studentized residuals for a brm-fitted object, since I do not know how to calculate hat values as in simple linear regression. Many thanks!

Hi, could you please share more details about your model, dataset and the real-world question you are trying to answer? Without those details it is unfortunately impossible to say anything more than that there is probably something wrong with your model.

Many thanks! Using data from many plots, I am trying to study how the similarity of pairwise species is correlated with plot-level species richness (scal.logrich), plot topographic complexity (scal.elevdiff), and species abundance (scal.abd and scal.abdpair). The prefix ‘scal.’ means the variable was standardized, and ‘log’ means it was log-transformed.

The response ‘’ lies in [0,1) with many zeros, so I chose a zero-inflated beta model to fit the data.
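For reference, the zero-inflated beta likelihood can be written as below. This is a sketch of the mean/precision parameterization that brms documents for its zero_inflated_beta family, where the symbols $\mu$ (mean of the beta part), $\phi$ (precision) and $zi$ (zero-inflation probability) correspond to the three formula parts of the model; $B(\cdot,\cdot)$ denotes the beta function.

```latex
f(y) =
\begin{cases}
  zi, & y = 0, \\[4pt]
  (1 - zi)\,
  \dfrac{y^{\mu\phi - 1}\,(1 - y)^{(1 - \mu)\phi - 1}}
        {B\!\left(\mu\phi,\ (1 - \mu)\phi\right)}, & 0 < y < 1 .
\end{cases}
```

Under this form the zeros are explained only by $zi$, so the predictors of zi matter for how well the share of zeros is reproduced.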

The brms code I used is:

lfm <- bf( ~ poly(scal.abd,3)+poly(scal.abdpair,3)+scal.logrich+scal.elevdiff+poly(scal.abd,3):scal.logrich+
          zi ~ poly(scal.abd,1)+poly(scal.abdpair,1)+scal.logrich+scal.elevdiff+poly(scal.abd,1):scal.logrich+poly(scal.abdpair,1):scal.logrich+
          phi ~ poly(scal.abd,1)+poly(scal.abdpair,1)+scal.logrich+scal.elevdiff+poly(scal.abd,3):scal.logrich+poly(scal.abdpair,1):scal.logrich+

lprior <- c(prior(normal(0,100),class="b"),

bayesfit_zb_pair <- brm(formula=lfm, prior=lprior, data=adat, control = list(adapt_delta = 0.95), 

the result:

I guess using a GAM with brm could help. I look forward to comments. Thanks.

Below is the plot of residuals vs. fitted values for the model I described above. @martinmodrak and friends, I would highly appreciate your help.

OK, so maybe I misunderstood the plot. The doc for ppc_loo_pit_qq says:

Comparing to the uniform is not good for extreme probabilities close to 0 and 1, … However, in most cases we have found that the overlaid density plot ( ppc_loo_pit_overlay() ) function will provide a clearer picture of calibration problems than the Q-Q plot.

So it might not be a problem at all. Would you try ppc_loo_pit_overlay? Also, you should try additional PP checks, especially against some subgroups of the data, e.g. ppc_violin_grouped grouped by plotcode or, even better, grouped by a variable not included in your model. Looking at ppc_dens might also be worthwhile, and ppc_stat or ppc_stat_grouped looking at some aspects you don’t model (like the maximum or the 95% quantile) could also be of interest. All of those should help you pinpoint a more specific problem (e.g. “group X is not well modelled” or “the tail quantiles are not well modelled”).
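The checks listed above could be run roughly as follows. This is only a sketch: it assumes the fitted object is called bayesfit_zb_pair as in the code earlier in the thread and that the data contain a grouping column named plotcode. The helper names prop_zero, q95 and run_pp_checks are made up for illustration and are not part of brms or bayesplot.

```r
library(brms)

# Hypothetical summary statistics for the ppc_stat-style checks.
# Proportion of exact zeros: checks the zero-inflation part of the model.
prop_zero <- function(y) mean(y == 0)
# 95% quantile: checks how well the upper tail is reproduced.
q95 <- function(y) unname(quantile(y, 0.95))

# Hypothetical wrapper that collects the suggested posterior predictive checks.
# `fit` is a brmsfit object; `group` names a column of the model data (assumed).
run_pp_checks <- function(fit, group = "plotcode", ndraws = 100) {
  list(
    # LOO-PIT values overlaid on uniform draws
    loo_pit  = pp_check(fit, type = "loo_pit_overlay", ndraws = ndraws),
    # Densities of replicated data against the observed response
    dens     = pp_check(fit, type = "dens_overlay", ndraws = ndraws),
    # Observed vs. replicated distributions within each group
    violin   = pp_check(fit, type = "violin_grouped", group = group,
                        ndraws = ndraws),
    # Statistics the model does not target directly
    zeros    = pp_check(fit, type = "stat", stat = prop_zero, ndraws = ndraws),
    maximum  = pp_check(fit, type = "stat", stat = "max", ndraws = ndraws),
    tail_grp = pp_check(fit, type = "stat_grouped", stat = q95, group = group,
                        ndraws = ndraws)
  )
}

# Usage, assuming the model from this thread has been fitted:
# checks <- run_pp_checks(bayesfit_zb_pair)
# checks$zeros
```

If the observed proportion of zeros falls in the tail of the replicated values, that would point at the zi part of the model; a mismatch only in some groups would point at missing group-level structure.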

Best of luck!

Thanks a lot, @martinmodrak! The overlaid density plot looks even worse, but pp_check looks good. I will try the other checks you suggested.

When you say “So it might not be a problem at all”, do you mean the model is OK? I am not confident saying that, since the residuals do not follow the distribution the model assumes. This can also be seen in the plot of residuals vs. fitted values above.

In addition, the plot of residuals vs. fitted values above shows an outlier. Is there any approach to handle it?

Your kind help is highly appreciated!