A quick note what I infer from p_loo and Pareto k values

It means that the observation is highly influential: the leave-one-out posterior obtained when that observation is left out differs so much from the full posterior that the importance sampling approximation of LOO fails (and WAIC would fail, too). See Pareto smoothed importance sampling for more. An observation can be highly influential for different reasons; see LOO Glossary Pareto k estimates.
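To make the failure mode concrete, here is a minimal sketch of plain importance sampling LOO for a single observation. It is not the Pareto smoothed version that loo() uses, and the function names and the log-likelihood values are made up for illustration. The raw importance ratio for draw s is 1 / p(y_i | theta_s), so a draw under which y_i is very unlikely gets a very large weight.

```python
import math


def log_mean_exp(values):
    """Numerically stable log(mean(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def is_loo_elpd(log_lik_i):
    """Plain (unsmoothed) importance sampling LOO estimate for one observation.

    log_lik_i holds log p(y_i | theta_s) for each posterior draw s.
    With raw ratios r_s = 1 / p(y_i | theta_s), the estimate reduces to
    -log(mean_s(1 / p(y_i | theta_s))).
    """
    return -log_mean_exp([-ll for ll in log_lik_i])


def max_normalized_weight(log_lik_i):
    """Largest self-normalized raw weight; a value near 1 means one draw dominates."""
    log_r = [-ll for ll in log_lik_i]
    m = max(log_r)
    w = [math.exp(v - m) for v in log_r]
    return max(w) / sum(w)


# Hypothetical pointwise log-likelihoods over four posterior draws.
typical = [-1.0, -1.1, -0.9, -1.05]      # all draws explain y_i about equally well
influential = [-1.0, -1.1, -0.9, -9.0]   # one draw makes y_i very unlikely

print(max_normalized_weight(typical))
print(max_normalized_weight(influential))
print(is_loo_elpd(typical), is_loo_elpd(influential))
```

In the second case almost all of the weight sits on a single draw, so the estimate is effectively based on one sample. That heavy right tail of the ratios is what the Pareto k diagnostic measures.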

With khat > 0.7 the error can be negligible or very large with some probability, but since you can’t know how large the error can be or what that probability is, what to do depends on how you are going to use the model. I recommend looking at those 15 observations and checking whether you can see why they are highly influential. This will also improve your understanding of the phenomenon, data and model.
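As a small sketch of how to pick out the observations to look at: assuming you have the pointwise khat values as a plain list (the values and the helper name `flag_high_k` below are hypothetical; in practice the values would come from the loo() output), you can list the indices above the 0.7 threshold and then inspect those rows of the data.

```python
def flag_high_k(khat, threshold=0.7):
    """Return (index, khat) pairs for observations whose khat exceeds the threshold."""
    return [(i, k) for i, k in enumerate(khat) if k > threshold]


# Hypothetical pointwise Pareto khat values, one per observation.
khat = [0.12, 0.81, 0.45, 0.69, 1.30, 0.05]

for i, k in flag_high_k(khat):
    # Indices are 0-based here; look up the corresponding data rows to see
    # what makes these observations unusual for the model.
    print(f"observation {i}: khat = {k:.2f}")
```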

A small proportion means that a small proportion of the observations is highly influential, and there can be different reasons why observations are highly influential; see the link above.

That is likely, as variational inference is less accurate than MCMC.

The purpose of convergence diagnostics is different from the purpose of loo(). In addition to checking divergences, it is recommended to also use other convergence diagnostics (e.g. Rhat and ESS, and also look at the MCSE for quantities of interest). If the convergence diagnostics don’t alert about possible issues, then loo() can be used as part of model checking, assessment of predictive performance and model comparison.
