Posterior Predictives look good, but PSIS-LOO and WAIC are bad

With bad Pareto k estimates, you cannot trust PSIS-LOO or WAIC to identify the model with better predictive performance. Without more details about your model and the LOO output it's hard to advise further, except to say that you might be able to get a robust LOO-CV estimate by moment matching or (if the model doesn't take too long to fit) by actually running exact leave-one-out cross-validation, refitting the model once per held-out observation. The simplest way to do this depends on which interface/package you are using to fit the model.
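To make the exact-LOO option concrete, here is a minimal sketch of brute-force leave-one-out cross-validation. To keep it self-contained I use a hypothetical conjugate normal model with known observation noise, so each "refit" is a closed-form update; with a real MCMC model you would instead refit the sampler once per held-out observation and average the predictive density over posterior draws.

```python
# Sketch: brute-force exact LOO-CV under a toy conjugate normal model
# (mu ~ Normal(mu0, tau2), y_i ~ Normal(mu, sigma2) with sigma2 known),
# so each leave-one-out "refit" is analytic rather than a new MCMC run.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=20)   # observed data (simulated here)
sigma2 = 1.0                        # known observation variance
mu0, tau2 = 0.0, 10.0               # prior mean and variance for mu

def log_pred_density(y_train, y_new):
    # Posterior of mu given y_train (normal-normal conjugacy)
    n = len(y_train)
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * (mu0 / tau2 + y_train.sum() / sigma2)
    # Posterior predictive for y_new is Normal(post_mean, post_var + sigma2)
    v = post_var + sigma2
    return -0.5 * (np.log(2 * np.pi * v) + (y_new - post_mean) ** 2 / v)

# Exact LOO: hold out each observation in turn and score it
elpd_loo = sum(
    log_pred_density(np.delete(y, i), y[i]) for i in range(len(y))
)
print(elpd_loo)
```

Because every fold is refit from scratch, this estimate needs no importance sampling and has no Pareto k diagnostic to worry about; the cost is one refit per observation.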

Finally, don’t confuse lack of misspecification (i.e. capturing the true generative process), predictive performance (ability to predict new or held-out observations), and Pareto k (a diagnostic for whether the PSIS-LOO approximation itself is reliable).

  • Sometimes the correctly specified model (i.e. one that captures the true generative process) can yield worse predictive performance than a simpler model. Edit: this distinction can matter a lot, depending on whether your goal is prediction or accurate uncertainty quantification around particular covariates.
  • Bad Pareto k doesn’t necessarily mean that the model is bad, either in the sense of poor prediction or of misspecification. In some scenarios it can be indicative of misspecification, but you need to consider more information to conclude whether that is the case (see here: A quick note what I infer from p_loo and Pareto k values). The only thing that high Pareto k means in all situations is that you probably shouldn’t trust the PSIS-LOO or WAIC computations to correctly rank your models by predictive performance, as assessed by genuine leave-one-out cross-validation.
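To give some intuition for what the diagnostic measures: the Pareto k (khat) is the shape parameter of a generalized Pareto distribution fit to the upper tail of the raw importance ratios; heavy-tailed ratios produce large k, meaning a few points dominate the importance-sampling estimate. The sketch below is my own illustration of that idea (it is not the actual PSIS implementation, which also regularizes the fit and smooths the tail weights): I simulate "well-behaved" and "heavy-tailed" ratios and fit the tail with scipy.

```python
# Sketch (illustration only, not the real PSIS estimator): fit a
# generalized Pareto distribution to the upper tail of importance
# ratios and read off the shape parameter as a rough "khat".
import numpy as np
from scipy.stats import genpareto

def khat(ratios, tail_frac=0.2):
    ratios = np.sort(np.asarray(ratios))
    n_tail = max(5, int(tail_frac * len(ratios)))
    u = ratios[-n_tail - 1]            # threshold just below the tail
    excesses = ratios[-n_tail:] - u    # exceedances over the threshold
    # Maximum-likelihood GPD fit to the tail excesses (location fixed at 0)
    k, _, _ = genpareto.fit(excesses, floc=0.0)
    return k

rng = np.random.default_rng(1)
light = rng.lognormal(0.0, 0.5, size=4000)  # well-behaved ratios
heavy = rng.lognormal(0.0, 2.5, size=4000)  # heavy-tailed ratios
print(khat(light), khat(heavy))  # heavier tail -> larger shape estimate
```

In practice you would not roll your own: the `loo` R package and ArviZ in Python both report per-observation Pareto k values, and the same large-k points are natural candidates for moment matching or exact refits.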