Diagnostics for Mixture IS leave-one-out cross-validation?

Hi all,

I’m currently trying to implement some model comparison metrics for a high-dimensional model.

I don’t have an MWE to share, but there’s a writeup of a previous implementation of the model here: Timescales of influenza A/H3N2 antibody dynamics. In short, the model has a small number of continuous-valued parameters (~10) alongside a large number of discrete binary parameters (~3000), all sampled with a Gibbs sampler. It is fit to roughly 12,000 observations (antibody measurements) across 70 individuals.

Given that the model takes a fair amount of time to fit, brute-force LOO-CV is not feasible, so an approximation seems like it would be very helpful. However, PSIS does not seem suitable either, given the poor distribution of the Pareto k̂ diagnostic values.
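For reference, this is roughly how I'm checking the k̂ diagnostics (a minimal sketch with a stand-in log-likelihood array; the array name, shapes, and the 0.7 rule of thumb are my own choices, while `az.from_dict` and `az.loo` are the actual ArviZ calls):

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the sampler output: pointwise log-likelihoods with
# shape (n_chains, n_draws, n_obs); replace with the real draws.
log_lik = rng.normal(loc=-1.0, scale=0.5, size=(4, 1000, 200))

idata = az.from_dict(log_likelihood={"y": log_lik})
# reff=1.0 because there is no posterior group here for ArviZ to
# compute the relative efficiency from; drop it if you have one.
loo_res = az.loo(idata, pointwise=True, reff=1.0)
print(loo_res)  # elpd_loo, p_loo, and counts of good/bad Pareto k values

k = loo_res.pareto_k.values
print(f"{np.mean(k > 0.7):.1%} of observations have k-hat > 0.7")
```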

Reading through the CV-FAQ, I came across the Mixture IS approach, which seems like a good solution, and I was able to implement it fairly easily (a sketch of the estimator is below).
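Concretely, what I've implemented is, as I understand it, the estimator from Silva & Zanella (2022): sample from the mixture q(θ) ∝ p(θ, y) Σ_j 1/p(y_j | θ) and estimate each p(y_i | y_{-i}) as a ratio of importance-sampling averages. A minimal NumPy/SciPy sketch, assuming a (draws × observations) matrix of pointwise log-likelihoods evaluated at the mixture draws:

```python
import numpy as np
from scipy.special import logsumexp

def mixture_is_elpd(log_lik):
    """Pointwise elpd_loo from draws targeting the mixture
    q(theta) ∝ p(theta, y) * sum_j 1/p(y_j | theta).
    `log_lik` is an (n_draws, n_obs) matrix of pointwise
    log-likelihoods evaluated at the mixture draws."""
    # log sum_j 1/p(y_j | theta_s), one value per draw
    l_mix = logsumexp(-log_lik, axis=1)
    # shared numerator: estimates p(y) up to the mixture's normalising constant
    log_num = logsumexp(-l_mix)
    # per-observation denominators: estimate p(y_{-i}) up to the same constant
    log_den = logsumexp(-log_lik - l_mix[:, None], axis=0)
    return log_num - log_den  # log p(y_i | y_{-i}) for each i

# toy run with random numbers, just to show the shapes involved
elpd_i = mixture_is_elpd(np.random.default_rng(1).normal(-1.0, 0.3, (2000, 50)))
print(elpd_i.sum())  # total ELPD estimate
```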

However, this approach doesn't seem to come with any comparable diagnostics, so I'm not sure how to assess the validity of the outputs. Qualitatively it seems to behave well (e.g. it reports worse scores for chains that have not converged, or for models where I have removed an important component), but otherwise it's a bit opaque.

Would it be valid, for example, to compare the ELPD estimates produced by PSIS against those produced by Mixture IS, at least for the observations where the PSIS k̂ values look acceptable? Something like the sketch below is what I have in mind:
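```python
import numpy as np

# Assumes loo_res (PSIS-LOO with pointwise=True) and elpd_i (mixture IS)
# from the sketches above were computed for the same model and observations.
elpd_psis = loo_res.loo_i.values
k = loo_res.pareto_k.values

ok = k < 0.7  # keep only observations where PSIS itself looks reliable
diff = elpd_psis[ok] - elpd_i[ok]
se = diff.std(ddof=1) / np.sqrt(ok.sum())
print(f"mean pointwise difference where k-hat < 0.7: {diff.mean():.3f} +/- {se:.3f}")
```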
