Hi Stan folks, a coworker of mine recently reminded me of leave-one-cluster-out (LOcO) cross-validation for comparing hierarchical models. Sophia Rabe-Hesketh and Dan Furr gave an interesting talk on this at StanCon Asilomar.
My recollection of their technique was that they used quadrature to produce the marginal likelihoods needed for computing the expected log predictive density.
My question is: can we approximate LOcO from the standard pointwise log-likelihood of (say) rstanarm? For each cluster, the importance weight for a posterior draw would be the inverse of the joint likelihood of all the points in that cluster (the product of their pointwise likelihoods), and then Pareto smoothing could be applied to the weights as in LOO. Or is this only possible for a single held-out point?
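To make the question concrete, here is a rough sketch of the weights I have in mind, written in Python/NumPy rather than R. It assumes a draws-by-observations matrix of pointwise log-likelihoods (the kind of matrix `log_lik()` returns for an rstanarm fit) and a vector of cluster labels; the function and argument names are made up for illustration. It uses raw, unsmoothed importance weights, so the Pareto smoothing step is left out, and it works with the likelihood conditional on the cluster-level parameters rather than the marginal likelihood from quadrature.

```python
import numpy as np

def logsumexp(a, axis=0):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def loco_elpd_raw_is(log_lik, cluster_ids):
    """Raw importance-sampling estimate of the LOcO elpd, one value per cluster.

    log_lik     : (n_draws, n_obs) array of pointwise log-likelihoods
    cluster_ids : (n_obs,) array of cluster labels
    Both names are hypothetical; no Pareto smoothing is applied here.
    """
    log_lik = np.asarray(log_lik, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    n_draws = log_lik.shape[0]
    clusters = np.unique(cluster_ids)
    elpd = np.empty(len(clusters))
    for j, c in enumerate(clusters):
        # Joint log-likelihood of every point in cluster c, per posterior draw.
        joint = log_lik[:, cluster_ids == c].sum(axis=1)
        # Log importance weight = minus the joint log-likelihood.
        log_w = -joint
        # Self-normalized IS: sum_s(w_s * p_s) / sum_s(w_s) = S / sum_s(1 / p_s).
        elpd[j] = np.log(n_draws) - logsumexp(log_w)
    return clusters, elpd

# Toy usage: 2 draws, 1 observation in 1 cluster, likelihoods 0.5 and 0.25.
clusters, elpd = loco_elpd_raw_is(np.log([[0.5], [0.25]]), np.array([0]))
print(clusters, elpd)
```

Since whole clusters are dropped at once, I would expect these weights to have much heavier tails than the pointwise LOO weights, which is really the heart of my question.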