# Max Farrell's tutorial says refit model without outliers k>0.7

Regarding finding outliers with pareto k values > 0.7, Max Farrell’s tutorial says:

One or two moderate outliers (k > 0.5) shouldn’t have too much of an effect on the model. But rather than ignoring this, one option is to re-fit the model without these problematic observations and calculate the loo statistic directly for them.
`loo1 <- loo(stan_glm1, k_threshold=0.7)`
`loo2 <- loo(stan_glm2, k_threshold=0.7)`
We can then compare models:
`compare(loo1, loo2)`

What is `loo(stan_glm1, k_threshold=0.7)` doing? It also looks like `k_threshold` is no longer an argument for `loo`. Is this performing LOO-CV but without the outlier values? Why is that justified?


Hi and welcome to Discourse!

The tutorial you linked is using the `loo` function from R package `rstanarm`, not from R package `loo`. `rstanarm` provides a nice vignette for how it implements `loo` here: Information criteria and cross-validation — loo.stanreg • rstanarm

Briefly, what’s happening here is the following:

• `loo` is an approximation to leave-one-out cross-validation
• For observations with high Pareto k, the approximation might fail
• One solution is to actually do the leave-one-out procedure for the observations with bad Pareto k, and that is what `k_threshold` controls. If the function finds an observation with bad Pareto k, then instead of relying on PSIS-LOO to estimate the predictive performance when that observation is left out, it refits the model with just that observation held out, performing exact leave-one-out cross-validation for the observations that the PSIS-LOO approximation does not handle well. It repeats this for every observation with bad Pareto k.
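Concretely, with `rstanarm` this looks something like the following sketch (the model here is an illustrative example, not the one from the tutorial):

```r
library(rstanarm)

# Illustrative model; any stan_glm fit works the same way
fit <- stan_glm(mpg ~ wt + hp, data = mtcars, refresh = 0)

# PSIS-LOO; for any observation with Pareto k > 0.7, rstanarm refits
# the model with that observation held out and computes its elpd
# contribution exactly instead of using the importance-sampling estimate
loo1 <- loo(fit, k_threshold = 0.7)
print(loo1)
```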

Thank you! That makes sense. Is this method suitable for and/or easily implementable for more complicated models with e.g. the package `loo`?


The procedure is certainly valid for other models (any time that leave-one-out cross-validation is valid, this procedure should be valid). However, the procedure is largely superseded by the new implementations of `loo` that include corrections based on moment matching. See here for details
https://cran.r-project.org/web/packages/loo/vignettes/loo2-moment-matching.html
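With an `rstan` fit, the moment-matching call is a one-liner (a sketch; `fit` is assumed to be an existing `stanfit` object whose generated quantities include the pointwise log-likelihood, commonly named `log_lik`):

```r
library(rstan)
library(loo)

# 'fit' is an assumed stanfit object with a log_lik generated quantity.
# Moment matching transforms the posterior draws for high-k observations
# instead of refitting the model, so it is much cheaper than exact
# re-estimation while still correcting the failing PSIS approximations.
loo_mm <- rstan::loo(fit, moment_match = TRUE)
print(loo_mm)
```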

If moment matching is not a good option for you (e.g. if you’ve fit the model in pure Stan using complex constraint transforms that make wrangling the unconstrained draws prohibitive), then it shouldn’t be overly difficult to implement the procedure that `rstanarm` uses, I think. If you need to go this route, feel free to ping us again on this thread for pointers.
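A minimal sketch of that manual procedure, assuming a pure-`rstan` workflow where a data list `stan_data` and a `log_lik` generated quantity already exist (all names here are illustrative, and the refit details depend on your data block):

```r
library(rstan)
library(loo)

loo_res <- loo(fit)                         # initial PSIS-LOO
bad <- which(pareto_k_values(loo_res) > 0.7)

for (i in bad) {
  # Refit with observation i held out; stan(fit = ...) reuses the
  # already-compiled model rather than recompiling it
  data_i <- stan_data
  data_i$y <- stan_data$y[-i]
  data_i$N <- stan_data$N - 1
  fit_i <- stan(fit = fit, data = data_i, refresh = 0)

  # Then evaluate log p(y_i | y_{-i}) by averaging the likelihood of
  # the held-out observation over the posterior draws of fit_i, and
  # substitute it for the i-th pointwise elpd contribution in loo_res
}
```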


Thanks a lot. However, I can’t get moment matching to work. When I try `loo2 <- loo(fit, moment_match = TRUE)`, I get the error:

```
Error in .local(object, ...) :
  the model object is not created or not valid
```

In the vignette, they obtain their model fit object with the function `sampling`, whereas I use `stan`. However, the documentation says both functions produce an object of S4 class `stanfit`, and the `loo` function says it takes an object of S4 class `stanfit`, as does `loo_moment_match`, but that gives me the same error.

Update: I have tried refitting the model after finding this past question Loo moment matching not working - #10 by sergej, but now R aborts as soon as I try `loo2 <- loo(fit, moment_match = TRUE)`. This particular model is extremely quick to fit (4 chains of 2000 iterations run in less than a minute), so I don’t see why this would break.

Which package’s “loo” are you using?

I have tried both `rstan::loo` and `loo::loo`. It appears to work on my laptop now (yet to retry on my desktop, which is where it broke previously) after re-running the model fit with the argument `cores=1`. It's a bit annoying that it doesn’t work on model fits I have already run and saved, but hey ho. After moment matching, all `k < 0.7`, so all elpd estimates are probably okay. What does this mean for the other interpretation of `k`, judging which observations may be very influential on the posterior? Obviously moment matching doesn’t change the model fit, so I assume you still use the original `k` values from `loo` to say which observations may bias the data?

Yup. If you are using `k` to diagnose influential observations, then you would still use the originals. Note that

> may bias the data

is not precise language in this context. I can’t tell if I’m being obnoxiously pedantic or usefully clarifying by pointing this out. These observations with high `k` probably have outsized influence on the model posterior, but that doesn’t mean that the data are biased due to their inclusion. In some scenarios, high `k` can also be suggestive of model misspecification. For more, see
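For example, the original diagnostics can be pulled out with `pareto_k_values()` from the `loo` package (`fit` is an assumed model fit; the threshold is the usual 0.7 from the PSIS diagnostics):

```r
library(loo)

loo_orig <- loo(fit)                 # without moment matching
k <- pareto_k_values(loo_orig)
which(k > 0.7)                       # candidate influential observations
```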


Thanks very much!
