Improving a model when some observations have Pareto k > 0.7

I ran a multilevel model with brms (file attached). I kept the main response variable in its native distribution (beta), then transformed the covariates to linearize them and standardized them, for two reasons. First, standardizing improves model fit because it centers the covariates at zero and equalizes their variances. Second, it makes the variance explained comparable across covariates.

The trace plots look good, but I get a warning saying that the Pareto k is larger than 0.7 for some observations. I checked this forum, and people suggest either changing the distribution of each response variable (for example, moving from gaussian to lognormal or gamma) or modifying the priors. In my case, since I have already scaled the covariates, I have negative values for my covariates and therefore cannot use the lognormal or gamma distribution. I also used the default priors.

How could I improve my model so that I don’t get this warning message?
pareto >0.7.pdf (160.4 KB)

Thanks,

See the LOO glossary entry "Interpreting p_loo when Pareto k is large".

It’s likely that some of the group-specific parameter posteriors in the random-effects part of the model are conditioned on only a small number of observations, so leaving out one observation changes the posterior too much for PSIS-LOO to give an accurate estimate (which is what the diagnostic indicates). How many observations are in the groups to which these influential observations belong? Can you show the full loo diagnostic? How many observations do you have in total, and how many per group? How many parameters does your model have? What happens if you run with reloo=TRUE, as suggested by the warning message?
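A minimal sketch of how these checks could be run in R. It assumes the fitted brms model is stored in an object called `fit` and that the grouping factor is a column of the model data whose name you pass as `group`. Both names are placeholders, and the two helper functions are hypothetical, not part of brms or loo.

```r
# Base-R helper (hypothetical name): for each flagged observation,
# report which group it belongs to and how many observations that group has.
flagged_group_sizes <- function(groups, flagged_ids) {
  groups <- as.character(groups)
  sizes <- table(groups)
  data.frame(
    obs = flagged_ids,
    group = groups[flagged_ids],
    n_in_group = as.integer(sizes[groups[flagged_ids]])
  )
}

# Sketch of the full check. Assumes `fit` is a brmsfit and `group` is the
# name of the grouping column in the model data (both are placeholders).
diagnose_loo <- function(fit, group, threshold = 0.7) {
  loo1 <- brms::loo(fit)
  print(loo1)  # full diagnostic: elpd_loo, p_loo, Pareto k table
  cat("observations:", nrow(fit$data), "\n")
  # Indices of observations whose Pareto k exceeds the threshold
  bad <- loo::pareto_k_ids(loo1, threshold = threshold)
  flagged_group_sizes(fit$data[[group]], bad)
}

# Exact refits for the flagged observations (one refit each, can be slow):
# loo_exact <- brms::loo(fit, reloo = TRUE)
```

If the flagged observations mostly sit in groups with very few members, that would be consistent with the explanation above. Posting the printed output together with the per-group counts would answer most of the questions.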

See also an example of loo behavior with a random effects model

I’ll add a separate entry to the CV-FAQ for hierarchical/multilevel/mixed-effects models.
