I’ve built a zero-one-inflated beta regression on linguistic alignment data (a 0–1 scale, from no word reuse to exact repetition).
The zero-inflation probability is π0, the one-inflation probability is π1, and the mean of the beta component is μ.
Conceptually, I am interested in:
- exact repetitions between interlocutors (π1),
- the alignment rate (1 − π0), or the propensity to reuse the other person’s words,
- the alignment level (μ), or the amount of word reuse when there is indeed alignment.
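To make the decomposition concrete, here is a small numeric sketch of how those three quantities derive from the parameters in the brms parameterisation of this family (zoi = P(y ∈ {0, 1}), coi = P(y = 1 | y ∈ {0, 1})); the parameter values themselves are made up for illustration:

```r
zoi <- 0.30   # probability that the outcome is exactly 0 or 1
coi <- 0.40   # probability of a 1, given a 0/1 outcome
mu  <- 0.55   # mean of the beta component (for 0 < y < 1)

pi0 <- zoi * (1 - coi)    # exact non-repetition: P(y = 0)
pi1 <- zoi * coi          # exact repetition:     P(y = 1)

alignment_rate  <- 1 - pi0   # propensity to reuse any words
alignment_level <- mu        # amount of reuse, given 0 < y < 1

overall_mean <- pi1 + (1 - zoi) * mu   # E[y] of the full mixture
```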
I’m trying to generate meaningful plots of the posterior estimates and of how well they describe the data. I used `pp_check` and the result looks good, but I’d like a more granular perspective on the components of the model:
```r
PredsZOI_LA      <- posterior_predict(model, dpar = "zoi")[1, ]
PredsCOI_LA      <- posterior_predict(model, dpar = "coi")[1, ]
LexicalAlignment <- posterior_predict(model, dpar = "mu")[1, ]

### Unconditioning the inflation
LexicalRate        <- 1 - (PredsZOI_LA * (1 - PredsCOI_LA))
LexicalRepetitions <- PredsCOI_LA * PredsZOI_LA
```
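For comparison, here is a hedged sketch of the same quantities built from `posterior_epred`, which (unlike `posterior_predict`) documents a `dpar` argument in brms and returns the posterior expectation of that distributional parameter per observation; it also keeps all posterior draws rather than only the first row `[1, ]`:

```r
# Assumption: posterior_epred(model, dpar = ...) returns a draws x observations
# matrix of expected values for that distributional parameter.
ZOI <- posterior_epred(model, dpar = "zoi")
COI <- posterior_epred(model, dpar = "coi")
MU  <- posterior_epred(model, dpar = "mu")

LexicalRate        <- 1 - ZOI * (1 - COI)  # 1 - π0, per draw and observation
LexicalRepetitions <- ZOI * COI            # π1
LexicalAlignment   <- MU                   # μ
```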
But this gives me weird results (a poor fit to the data).
However, posterior predictive checks (via `pp_check`) and the model estimates show a good fit. For instance, if I manually extract the model coefficients and generate predictions from them (both for the population-level mean and for the group-level distribution), I get a much nicer fit:
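For reference, this is roughly what I mean by the manual approach, assuming an intercept-only ZOIB model with logit links on zoi, coi, and mu (the brms defaults); the coefficient names are illustrative, and `fixef(model)` shows the actual names for a given model:

```r
# Extract posterior draws of the intercepts and invert the logit link.
draws   <- as_draws_df(model)
zoi_hat <- plogis(draws$b_zoi_Intercept)
coi_hat <- plogis(draws$b_coi_Intercept)
mu_hat  <- plogis(draws$b_Intercept)

LexicalRepetitions <- zoi_hat * coi_hat            # π1
LexicalRate        <- 1 - zoi_hat * (1 - coi_hat)  # 1 - π0
LexicalAlignment   <- mu_hat                       # μ
```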
So I must be misunderstanding how `posterior_predict` works. Any suggestions?