Regarding finding outliers with Pareto k values > 0.7, Max Farrell’s tutorial says:
One or two moderate outliers (k>0.5) shouldn’t have too much of an effect on the model. But, rather than ignoring this, one option is to re-fit the model without these problematic observations, and directly calculate the loo statistic directly for them.
loo1 <- loo(stan_glm1, k_threshold=0.7)
loo2 <- loo(stan_glm2, k_threshold=0.7)
We can then compare models:
What is loo(stan_glm1, k_threshold=0.7) doing? It also looks like k_threshold is no longer an argument for loo. Is this performing LOO-CV but without the outlier values? If so, why is that justified?