Zero-inflated beta regression

ecorenhb · May 6, 2020, 12:38am

Hi there, I am using zero-inflated beta regression to model proportional data, but the LOO-PIT Q-Q plot (from the function ‘ppc_loo_pit_qq’) does not follow the standard uniform distribution; see the figure below. What should I do to improve the model? I wonder whether the trouble comes from the prediction of the zero-inflation probability, for example because it does not involve all the predictors, but I am not sure.

Another question is how to calculate studentized residuals for a brm-fitted object, since I do not know how to calculate the hat values as in simple linear regression. Many thanks!
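Hat values come from the least-squares projection matrix, so they have no direct counterpart for a zero-inflated beta model. One possible substitute is a Pearson-type residual that standardizes by the mean and variance the model implies. The following is a minimal base-R sketch under stated assumptions: the names `zib_mean` and `zib_var` are hypothetical helpers, the parameter values are made up, and the beta part is assumed to use the mean/precision parameterization (`shape1 = mu * phi`, `shape2 = (1 - mu) * phi`).

```r
set.seed(7)
n   <- 20000
mu  <- 0.3   # assumed mean of the beta part
phi <- 5     # assumed precision of the beta part
zi  <- 0.4   # assumed probability of a structural zero

# Simulated stand-in for the observed response
y <- ifelse(rbinom(n, 1, zi) == 1, 0, rbeta(n, mu * phi, (1 - mu) * phi))

# Mean and variance implied by the zero-inflated beta distribution
zib_mean <- function(mu, zi) (1 - zi) * mu
zib_var <- function(mu, phi, zi) {
  beta_var <- mu * (1 - mu) / (1 + phi)
  (1 - zi) * (beta_var + mu^2) - ((1 - zi) * mu)^2
}

# Pearson-type residual: (observed - expected) / model-implied sd
pearson <- (y - zib_mean(mu, zi)) / sqrt(zib_var(mu, phi, zi))

mean(pearson)  # should be near 0 when the model matches the data
var(pearson)   # should be near 1 when the model matches the data
```

With a fitted model, `mu`, `phi` and `zi` would vary by observation and by posterior draw, so the residuals would be computed per draw and then summarized. As far as I know, `residuals()` for brms fits also accepts `type = "pearson"`, which may be enough in practice.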

martinmodrak · May 12, 2020, 2:45pm

Hi, could you please share more details about your model, your dataset and the real-world question you are trying to answer? Without those details it is unfortunately impossible to say anything more than that there probably is something wrong with your model.

ecorenhb · May 16, 2020, 5:23pm

Many thanks! Using data from many plots, I am trying to study how the similarity of species pairs (obs.bz) is related to plot-level species richness (scal.logrich), plot topographic complexity (scal.elevdiff), and species abundance (scal.abd and scal.abdpair). The prefix ‘scal.’ means the variable was standardized, and ‘log’ means it was log-transformed.

The response ‘obs.bz’ lies in [0,1) and has many zeros, so I chose a zero-inflated beta model to fit the data.
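For reference, the distribution this choice assumes can be written as a point mass at zero mixed with a beta density on (0, 1). The block below is a sketch of that density in the mean/precision parameterization. The probit link for the mean comes from `link="probit"` in the code further down; the log link for phi and the logit link for zi are assumed to be the brms defaults, since the code does not set them. The symbols \eta_{\mu}, \eta_{\phi} and \eta_{zi} are shorthand for the three linear predictors.

```latex
f(y \mid \mu, \phi, zi) =
\begin{cases}
zi, & y = 0, \\
(1 - zi)\,\mathrm{Beta}\bigl(y \mid \mu\phi,\ (1-\mu)\phi\bigr), & 0 < y < 1,
\end{cases}
\qquad
\mu = \Phi(\eta_{\mu}),\quad
\phi = \exp(\eta_{\phi}),\quad
zi = \operatorname{logit}^{-1}(\eta_{zi}).
```

Under this form the expected response is (1 - zi) * mu, so a poor fit can come from any of the three sub-models, not only from the mean.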

The brms code I used is:

lfm <- bf(obs.bz ~ poly(scal.abd,3)+poly(scal.abdpair,3)+scal.logrich+scal.elevdiff+poly(scal.abd,3):scal.logrich+
          poly(scal.abd,2):poly(scal.abdpair,2)+(poly(scal.abd,3)+poly(scal.abdpair,3)||plotcode),
          zi ~ poly(scal.abd,1)+poly(scal.abdpair,1)+scal.logrich+scal.elevdiff+poly(scal.abd,1):scal.logrich+poly(scal.abdpair,1):scal.logrich+
          poly(scal.abd,1):poly(scal.abdpair,1)+(poly(scal.abd,1)+poly(scal.abdpair,1)||plotcode),
          phi ~ poly(scal.abd,1)+poly(scal.abdpair,1)+scal.logrich+scal.elevdiff+poly(scal.abd,3):scal.logrich+poly(scal.abdpair,1):scal.logrich+
          poly(scal.abd,2):poly(scal.abdpair,2)+(poly(scal.abd,1)+poly(scal.abdpair,1)||plotcode))

lprior <- c(prior(normal(0,100),class="b"),
             prior(normal(0,100),class="Intercept"),
             prior(normal(0,100),class="sd"))

bayesfit_zb_pair <- brm(formula=lfm, prior=lprior, data=adat, control = list(adapt_delta = 0.95), 
                          family=zero_inflated_beta(link="probit"),cores=no_cores)

The result:

I guess using a GAM with brm could help. I look forward to your comments. Thanks.

ecorenhb · May 20, 2020, 11:16pm

Below is the plot of residuals vs. fitted values for the model I described above. @martinmodrak and friends, any help would be highly appreciated.

martinmodrak · May 22, 2020, 11:15am

OK, so maybe I misunderstood the plot. The doc for ppc_loo_pit_qq says:

Comparing to the uniform is not good for extreme probabilities close to 0 and 1, … However, in most cases we have found that the overlaid density plot (ppc_loo_pit_overlay()) function will provide a clearer picture of calibration problems than the Q-Q plot.
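A separate point that may matter for zero-inflated responses, offered as a possibility and not as a diagnosis of this model: when the PIT is computed as P(y_rep <= y), every observed zero maps to the zero-inflation probability, so the PIT values cannot be uniform even if the model is exactly right. The base-R sketch below illustrates this with simulated data and made-up parameter values; `pzib` is a hypothetical helper for the zero-inflated beta CDF, and the randomized PIT shown is one common workaround for distributions with point masses.

```r
set.seed(123)
n   <- 5000
mu  <- 0.3   # assumed mean of the beta part
phi <- 5     # assumed precision of the beta part
zi  <- 0.4   # assumed probability of a structural zero

# Data simulated from the true zero-inflated beta distribution
is_zero <- rbinom(n, 1, zi) == 1
y <- ifelse(is_zero, 0, rbeta(n, mu * phi, (1 - mu) * phi))

# CDF of the zero-inflated beta: a jump of size zi at 0, then a scaled beta CDF
pzib <- function(q) zi + (1 - zi) * pbeta(q, mu * phi, (1 - mu) * phi)

# Plain PIT: every zero gets the value zi, and nothing falls below zi
pit_plain <- pzib(y)

# Randomized PIT: spread the zeros uniformly over (0, zi)
pit_rand <- ifelse(y == 0, runif(n, 0, zi), pzib(y))

min(pit_plain)            # equals zi, so the interval (0, zi) is empty
mean(pit_plain == zi)     # roughly the share of zeros in the data
quantile(pit_rand, c(0.25, 0.5, 0.75))  # should be near 0.25, 0.5, 0.75
```

If the Q-Q plot shows a flat segment or a jump around the estimated zero-inflation probability, this mechanism is a plausible contributor.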

So it might not be a problem at all. Would you try ppc_loo_pit_overlay? Also, you should try additional PP checks, especially against some subgroups of the data, e.g. ppc_violin_grouped grouped by plotcode or, even better, grouped by a variable not included in your model. Looking at ppc_dens might also be worthwhile, and ppc_stat or ppc_stat_grouped looking at some aspects you don’t model (like the maximum or the 95% quantile) could also be of interest. All of those should help you pinpoint a more specific problem (e.g. “group X is not well modelled” or “the tail quantiles are not well modelled”).
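To make the idea behind the statistic-based checks concrete, here is a minimal base-R sketch of the comparison they rest on: compute a statistic on the observed data and on each posterior predictive draw, then see where the observed value falls. Everything here is a stand-in. The data are simulated, `rzib` and `ppc_summary` are hypothetical helpers, and the "model" draws deliberately ignore zero inflation so that the check has something to find. With a real fit, `yrep` would be the matrix of posterior predictive draws (one row per draw) and the plotting would be done by ppc_stat.

```r
set.seed(42)
n_obs   <- 500
n_draws <- 200

# Hypothetical zero-inflated beta simulator (mean mu, precision phi, zero prob zi)
rzib <- function(n, mu, phi, zi) {
  y <- rbeta(n, mu * phi, (1 - mu) * phi)
  y[rbinom(n, 1, zi) == 1] <- 0
  y
}

# Stand-in for the observed response
y <- rzib(n_obs, mu = 0.3, phi = 5, zi = 0.4)

# Stand-in for posterior predictive draws: rows are draws, columns are observations.
# zi = 0 means this "model" never produces zeros, which is wrong on purpose.
yrep <- t(replicate(n_draws, rzib(n_obs, mu = 0.3, phi = 5, zi = 0)))

# Compare a statistic of the data with the same statistic of each draw
ppc_summary <- function(y, yrep, stat) {
  t_obs <- stat(y)
  t_rep <- apply(yrep, 1, stat)
  list(observed = t_obs, replicated = t_rep, p_value = mean(t_rep >= t_obs))
}

q95       <- function(x) unname(quantile(x, 0.95))
prop_zero <- function(x) mean(x == 0)

check_q95  <- ppc_summary(y, yrep, q95)
check_zero <- ppc_summary(y, yrep, prop_zero)

check_zero$observed  # share of zeros in the data
check_zero$p_value   # share of draws with at least as many zeros as the data
```

A p-value near 0 or 1 for a given statistic points at the aspect of the data the model fails to reproduce; repeating the same computation within groups gives the grouped version of the check.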

Best of luck!

ecorenhb · May 22, 2020, 11:44am

Thanks a lot, @martinmodrak! The overlaid density plot looks even worse, but pp_check looks good. I will try the other checks you suggested.

You mentioned that “it might not be a problem at all”. Does that mean the model is OK? I am not confident saying so, since the residuals do not follow the distribution the model assumes. This can also be seen in the plot of residuals vs. fitted values above.

In addition, the plot of residuals vs. fitted values above shows an outlier. Is there any approach to handle it?

Your kind help is highly appreciated!

martinmodrak · July 27, 2020, 8:53pm

Sorry, I completely forgot to get back to you. Did you manage to resolve the issue in the meantime?
