Pareto.pdf (5.6 KB)
Short summary of the problem
How to remove observations with a pareto_k > 0.7
- Operating System: macOS
- brms Version: 2.13.0
I used approximate leave-one-out cross-validation (LOO-CV) to validate a model and got this warning message:
“Found 3 observations with a pareto_k > 0.7 in model ‘SEM_brms’. It is recommended to set ‘reloo = TRUE’ to calculate the ELPD without the assumption that these observations are negligible. This will refit the model 3 times to compute the ELPDs for the problematic observations directly.”
I then tried to identify the observations with pareto_k > 0.7. I plotted the loo object with label_points = TRUE, i.e. plot(Criteria_pop$criteria$loo, label_points = TRUE), which produced the figure attached above (Pareto.pdf). From that figure, it is clear that observations 23, 34, and 46 are the “influential” data points.
My question is the following.
Is there an automatic way to delete these observations? By removing these “influential” data points, I hope to improve the model, and I would like to re-estimate the posterior and see whether it differs from the first one, which I got with the “influential” data points included.
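To make the question concrete, here is a minimal sketch of the kind of workflow I have in mind. It is not tested against my real model: `dat` and `k` are simulated stand-ins, `flag_high_k` is a hypothetical helper, and the commented brms/loo calls at the end assume that the fitted object and its data are available under those names.

```r
# Hypothetical stand-ins: in the real workflow, `k` would be the vector of
# Pareto k estimates, e.g. loo::pareto_k_values(Criteria_pop$criteria$loo),
# and `dat` would be the data frame the model was fitted to.
set.seed(1)
dat <- data.frame(id = 1:50, y = rnorm(50))
k <- rep(0.2, 50)
k[c(23, 34, 46)] <- c(0.75, 0.90, 1.10)

# Return the row indices whose Pareto k exceeds the threshold.
# loo::pareto_k_ids(loo_object, threshold = 0.7) should give the same
# indices directly from a loo object.
flag_high_k <- function(k, threshold = 0.7) which(k > threshold)

bad <- flag_high_k(k)

# Drop the flagged rows; guard against the case where nothing is flagged,
# because dat[-integer(0), ] would return zero rows.
dat_reduced <- if (length(bad) > 0) dat[-bad, , drop = FALSE] else dat

# Assumed follow-up with brms (not run here): refit on the reduced data
# and compare posteriors with the original fit.
# fit_reduced <- update(Criteria_pop, newdata = dat_reduced)
```

The alternative suggested by the warning itself would be to keep all observations and rerun the criterion with reloo = TRUE, so that the ELPD for these three points is computed by exact refits instead of the approximation.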
Thanks in advance