Any suggestions for improving model fit (zero-inflated highly positively skewed Y-variable)?

I have a zero-inflated Y variable.


I tried a hurdle_lognormal model in brms, and this was the result of pp_check().


As far as I understand, this fit cannot be considered acceptable. Any suggestions for how to proceed with the analysis?

It looks like there are at least three distinct lumps in the output. Is there anything in the process you’re measuring that could be driving that?

Is there any way that could be included as a covariate?

Thank you so much! I tried a few options, and it seems that binning the variable and using multinomial regression would be the only solution.

Fair. Just keep in mind that a complicated output can be driven by a complicated process plus a simple output distribution, so it isn’t necessarily best modeled with a complicated output distribution, if that makes sense.
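To make that concrete, here is a small simulation sketch in plain Python (not brms). Everything in it is made up for illustration: the three groups, their zero probabilities, and their lognormal locations are assumptions, not anything estimated from the data in this thread. Each group follows a simple hurdle-lognormal, yet the pooled positive values are spread across three well-separated lumps on the log scale.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical process: three groups, each with its own probability of a
# zero (hu) and its own lognormal location (mu). Sigma is shared.
GROUPS = {"a": (0.5, 0.0), "b": (0.3, 3.0), "c": (0.2, 6.0)}  # (hu, mu)
SIGMA = 0.4


def draw(group):
    """Draw one value from a hurdle-lognormal for the given group."""
    hu, mu = GROUPS[group]
    if random.random() < hu:
        return 0.0  # the hurdle part: an exact zero
    return random.lognormvariate(mu, SIGMA)  # the positive part


# Simulate (group, y) pairs with groups chosen uniformly at random.
data = [(g, draw(g)) for g in random.choices(list(GROUPS), k=6000)]


def log_positive(rows):
    """Log of the strictly positive responses in rows."""
    return [math.log(y) for _, y in rows if y > 0]


# Spread of log(y) ignoring the group versus within each group.
pooled_sd = statistics.stdev(log_positive(data))
within_sd = {
    g: statistics.stdev(log_positive([r for r in data if r[0] == g]))
    for g in GROUPS
}

print(f"pooled sd of log(y | y > 0): {pooled_sd:.2f}")
for g, sd in within_sd.items():
    print(f"group {g}: sd of log(y | y > 0) = {sd:.2f}")
```

Ignoring the group, the log of the positive values is far more spread out than a single lognormal with this sigma would produce. Within each group, the spread is close to the shared sigma. In brms terms, the analogous move would be to put such a covariate, if one exists, into both the lognormal part and the `hu` part of the hurdle_lognormal model, rather than switching to a more complicated response distribution.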
