Is LOO valid for models with missing outcome when using a complete case dataset that is a subset of the original data?

I honestly don’t really know. As I said, I once tried doing the integration and it seemed to work, but I am not completely sure it is sound, I’d ask @avehtari to confirm this, if he has time.

The main idea is that when you observe y which depends on a parameter r (here the imputed value), and the joint likelihood (using \pi to denote “density of”) decomposes as:

\pi(y, r | \theta_1, \theta_2) = \pi(y | r, \theta_1)\pi(r | \theta_2)

then the marginal for y can be obtained as

\pi(y | \theta_1, \theta_2) = \int_{-\infty}^{+\infty} \pi(y | r, \theta_1)\pi(r | \theta_2) \mathrm{d}r

here \pi(y | r, \theta_1) is the observational model (depends on the imputed value r and some other parameters) and \pi(r | \theta_2) is the imputation model (which once again depends on other parameters). Note that for loo you need to get the logarithm of the density.

Since this is a 1D integral, it is quite likely it can be solved relatively efficiently by Stan’s integrate_1d or any other standard integration method. I.e. for each sample, you take all the parameters of the linear predictors for the imputed value and the main response, build a function that expresses the (non-log) likelihood of the observed value as a function of the imputed value times the (non-log) likelihood of the imputed value and then integrate this function with respect to the imputed value. The log of the result is IMHO the log-likelihood you want to use for loo.

Hope I didn’t mess up :-)

1 Like