LOO - uncertainty in pareto-k estimates

I have a question concerning uncertainty in the Pareto-k values obtained from carrying out cross-validation in R via brms::loo.

I was running a model where I found that the Pareto-k estimates varied quite a lot when re-running the model, and started wondering what sort of uncertainty you might expect in these values. I created a simulation to provide an example:


library(brms)
library(tidyr)
library(ggplot2)

# random normal vector with a single 'odd' point added
d1 <- data.frame(test = c(rnorm(10), 4))
mx <- brm(test ~ 1, data = d1, prior = prior(normal(0, 3), class = 'Intercept'))

res_mat <- matrix(NA, ncol = 11, nrow = 100)

for (i in 1:100) {
  print(i) # I want to know where I've got to
  mx2 <- update(mx, newdata = d1, refresh = 0, silent = 0)
  res_mat[i, ] <- loo(mx2)$diagnostics$pareto_k
}

resdf <- res_mat %>%
  data.frame() %>%
  pivot_longer(1:11, names_to = 'obs_no')

p1 <- ggplot(resdf, aes(x = value, colour = obs_no)) +
  geom_density() +
  labs(x = 'pareto_k')

This re-fits exactly the same data each time. The 5%-95% quantiles for the Pareto-k estimate for the 'odd' point in the data from this seed are 0.7-1.07. There are similarly wide distributions of Pareto-k values for the other data points.

My concern is that, with this level of variability, it could be quite difficult to interpret the value against the suggested cutoffs, particularly if no warnings are issued, for example with a Pareto-k value of 0.6.

What sort of variability might be expected for these values? Is this a concern?

Thanks for any help in this (and pointing me in the right direction if I’ve misunderstood ‘loo’)… :)

tagged: @avehtari


There's an example of how to compute the marginal posterior for k in my post in the thread Stochasticity in pareto-k diagnostics over different fitting runs - #5 by avehtari. From that marginal you can infer the expected amount of variability. You can reduce the variability by running longer chains, or you can make the importance weights better behaved (and thus get lower k values) with moment matching loo: Avoiding model refits in leave-one-out cross-validation with moment matching • loo
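For the simulation above, a minimal sketch of what moment matching looks like in brms (assuming a brms version recent enough to expose the moment_match argument of loo(); save_pars is needed so the moment-matching step can access all posterior draws):

# Refit with all parameter draws saved, then request moment matching in loo().
mx_mm <- brm(test ~ 1, data = d1,
             prior = prior(normal(0, 3), class = 'Intercept'),
             save_pars = save_pars(all = TRUE))
loo_mm <- loo(mx_mm, moment_match = TRUE)
loo_mm$diagnostics$pareto_k  # k values for high-k points should come down

This avoids refitting the model for each left-out observation by adjusting the importance-sampling proposal instead.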


Great - thanks. That makes sense - I think I was misunderstanding moment matching to some degree.
