Zero and one inflated beta regression - posterior predictive checks by component (π1, π0, μ)

fusaroli · May 7, 2021, 12:37pm

I’ve built a zero and one inflated beta regression of linguistic alignment data (on a 0-1 scale, from no word re-use to exact repetition).
Zero inflation is π0, one inflation is π1, and the mean of the beta is μ.

Conceptually, I am interested in

exact repetitions between interlocutors (π1),
alignment rate (1 - π0), or the propensity to reuse the other person’s words,
alignment level (μ), or the amount of word reuse when there is indeed alignment.

I’m trying to generate meaningful plots of the posterior estimates and of how well they describe the data. I used pp_check and it looks good, but I’d like a more granular perspective on the components of the model.

I tried

PredsZOI_LA <- posterior_predict(model, dpar="zoi")[1,]
PredsCOI_LA <- posterior_predict(model, dpar="coi")[1,]
LexicalAlignment <- posterior_predict(model, dpar="mu")[1,]
### Unconditioning the inflation
LexicalRate <- 1 - (PredsZOI_LA * (1-PredsCOI_LA))
LexicalRepetitions <- PredsCOI_LA * PredsZOI_LA
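For reference, the "unconditioning" step can be written out as below. This assumes the usual brms parametrisation of zero_one_inflated_beta, in which zoi is the probability that an observation is 0 or 1 and coi is the conditional probability that it is 1 given that it is 0 or 1; the symbols π0, π1 and μ are the ones defined at the top of the post.

```latex
\begin{aligned}
\pi_0 &= P(y = 0) = \mathrm{zoi}\,(1 - \mathrm{coi}) \\
\pi_1 &= P(y = 1) = \mathrm{zoi}\cdot\mathrm{coi} \\
1 - \pi_0 &= 1 - \mathrm{zoi}\,(1 - \mathrm{coi}) \\
E[y] &= \pi_1 + (1 - \mathrm{zoi})\,\mu
\end{aligned}
```

Under that assumption, LexicalRate corresponds to 1 - π0 and LexicalRepetitions to π1, so the algebra in the two lines is fine as long as zoi and coi are the parameter values themselves rather than simulated outcomes.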

But it gives me weird results (a bad fit to the data).

However, predictive checks (via pp_check) and the model estimates show good fit. For instance, if I manually extract the coefficients of the model and generate predictions (both for population level mean, and for the group level distribution), I get a much nicer fit:

So, I must be misunderstanding the way posterior_predict works. Any suggestions?

martinmodrak · May 13, 2021, 9:45am

Hi,
a bit short on time, so just a quick note:

fusaroli:

PredsZOI_LA <- posterior_predict(model, dpar="zoi")[1,]
PredsCOI_LA <- posterior_predict(model, dpar="coi")[1,]
LexicalAlignment <- posterior_predict(model, dpar="mu")[1,]
### Unconditioning the inflation
LexicalRate <- 1 - (PredsZOI_LA * (1-PredsCOI_LA))
LexicalRepetitions <- PredsCOI_LA * PredsZOI_LA

I don’t think posterior_predict accepts a dpar parameter. If you want predictions for individual parameters, I think (can’t check now) that you need posterior_linpred or posterior_epred (which differ primarily in whether the inverse link function is applied).

Also, the indexing ([1,]) looks suspicious. AFAIK posterior_predict will give you a matrix with dimensions number of samples x number of rows in the dataset. So if you are trying to predict for only a single participant, I think you need [,1]? Generally, what you should usually be doing is computing the quantity of interest separately for each sample, which then gives you samples representing the posterior distribution of that quantity.
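A minimal sketch of that per-sample computation, in base R so it runs without a fitted model. The three matrices are simulated stand-ins (an assumption for illustration) with the draws x observations shape that posterior_epred(model, dpar = "...") would be expected to return, and zoib_components is a hypothetical helper name, not a brms function.

```r
# Stand-ins for brms output: each matrix is draws x observations.
# With a fitted model one would presumably use instead:
#   zoi <- posterior_epred(model, dpar = "zoi")
#   coi <- posterior_epred(model, dpar = "coi")
#   mu  <- posterior_epred(model, dpar = "mu")
set.seed(1)
n_draws <- 200
n_obs <- 50
zoi <- matrix(rbeta(n_draws * n_obs, 2, 8), n_draws, n_obs)
coi <- matrix(rbeta(n_draws * n_obs, 3, 3), n_draws, n_obs)
mu  <- matrix(rbeta(n_draws * n_obs, 5, 5), n_draws, n_obs)

# Hypothetical helper: combine the parameters element-wise, so every
# posterior draw keeps its own value of each component.
zoib_components <- function(zoi, coi, mu) {
  stopifnot(identical(dim(zoi), dim(coi)), identical(dim(zoi), dim(mu)))
  list(
    rate = 1 - zoi * (1 - coi),  # 1 - pi0: probability of any alignment
    repetition = zoi * coi,      # pi1: probability of exact repetition
    level = mu                   # mean of the beta part
  )
}

comp <- zoib_components(zoi, coi, mu)

# One value per draw: average over observations within each draw.
rate_per_draw <- rowMeans(comp$rate)

# Summarise the posterior distribution of the quantity of interest.
rate_summary <- quantile(rate_per_draw, c(0.025, 0.5, 0.975))
print(rate_summary)
```

Indexing with [1, ] on any of these matrices picks a single draw across all observations, while [, 1] picks all draws for the first observation.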

Best of luck!

fusaroli · May 13, 2021, 3:19pm

Somehow my code had drifted from posterior_epred() to posterior_predict(), probably because I was trying to better include the variance (phi). I have now switched back, and that solved the issue. Thanks.
(The [1, ] was just to simplify processing and reduce memory usage by looking at only one sample; I use n samples in the actual plot.)
