What would LOO with multiple imputations look like?

daniel_h · November 20, 2019, 12:24am

To my knowledge, calling loo on a brm_multiple object calculates the elpd for the first dataset only.
Is there any consensus, or are there any ideas, on what LOO with multiply imputed datasets would look like?

I think ideally one would like a solution that evaluates the combined posterior of the m imputed datasets. So maybe something like:

1. For each of the m imputed datasets, use standard PSIS-LOO to generate samples from the LOO posterior for leaving out data point y_{i}.
2. Combine the LOO posterior draws.
3. Use the combined posterior to evaluate the lpd for each y_{i} of the imputed datasets.

Does this even make sense? I feel like I’m missing something obvious here.

avehtari · November 20, 2019, 1:13pm

Are you imputing just the covariates or also the target variable y?

Assuming you are imputing just the covariates, and you are using R and the loo package:

1. For each of the m imputed datasets, compute loo as usual.
2. Each loo object has pointwise log predictive densities in loo_object$pointwise[,'elpd_loo'].
3. Compute the means of the pointwise predictive densities, exp(loo_object$pointwise[,'elpd_loo']), where the means are taken over the m imputed datasets (the result has n pointwise predictive densities).
4. elpd_loo is then the sum of the logs of the pointwise predictive densities from step 3.
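
A minimal sketch of the steps above, written in plain Python rather than R so that it is self-contained. The function name `combine_pointwise_elpd` and the example numbers are made up for illustration; in practice each inner list would hold the n values of loo_object$pointwise[,'elpd_loo'] from one imputed dataset. The averaging is done with a log-sum-exp for numerical stability, which gives the same result as exponentiating, averaging, and taking the log.

```python
import math


def combine_pointwise_elpd(elpd_by_dataset):
    """Combine pointwise elpd_loo values across m imputed datasets.

    elpd_by_dataset: list of m lists, each holding the n pointwise
    elpd_loo values (log predictive densities) from one dataset.
    Returns a list of n combined pointwise log predictive densities.
    """
    m = len(elpd_by_dataset)
    n = len(elpd_by_dataset[0])
    if any(len(row) != n for row in elpd_by_dataset):
        raise ValueError("every dataset must have the same n observations")

    combined = []
    for i in range(n):
        logs = [elpd_by_dataset[j][i] for j in range(m)]
        peak = max(logs)
        # log(mean(exp(logs))) computed stably via log-sum-exp
        log_mean = peak + math.log(sum(math.exp(v - peak) for v in logs)) - math.log(m)
        combined.append(log_mean)
    return combined


# Made-up pointwise elpd_loo values: m = 2 imputed datasets, n = 3 observations
example = [
    [-1.0, -2.0, -0.5],
    [-1.2, -1.5, -0.7],
]

pointwise = combine_pointwise_elpd(example)
elpd_loo = sum(pointwise)  # step 4: sum of the log pointwise densities

# For contrast: averaging on the log scale, which is not what is described above
naive = sum(sum(column) / len(example) for column in zip(*example))

print(elpd_loo, naive)
```

The `naive` value averages the log densities directly and is never larger than `elpd_loo`; the two agree only when all datasets give identical pointwise values.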

daniel_h · November 20, 2019, 4:03pm

Thank you!

Yes, I’m mostly thinking about the case of just imputing the covariates.
So it is enough to calculate elpd on the individual datasets and then average? I felt a bit reluctant to do this because the posterior that is actually used later on is the combined posterior and here we are evaluating the individual posteriors instead.

avehtari · November 20, 2019, 7:56pm

No. Calculate the pointwise predictive densities on the individual datasets, average them over the datasets, take the logarithm, and then sum the log pointwise predictive densities to get elpd.

No. Averaging the pointwise predictive densities over the datasets makes the result use the combined posterior.

You just have to be careful what you average here.
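
Written out, the distinction looks like this. The notation is an assumption introduced here, not from the loo package: \widehat{\mathrm{elpd}}_{i}^{(j)} is the pointwise elpd_loo value for observation i computed on imputed dataset j, with n observations and m datasets.

```latex
% Average the densities over datasets, then take the log, then sum over observations
\widehat{\mathrm{elpd}}_{\mathrm{loo}}
  = \sum_{i=1}^{n} \log\!\left( \frac{1}{m} \sum_{j=1}^{m}
      \exp\!\left( \widehat{\mathrm{elpd}}_{i}^{(j)} \right) \right)

% Averaging the log densities instead gives a different, never larger, value (Jensen's inequality)
\frac{1}{m} \sum_{j=1}^{m} \widehat{\mathrm{elpd}}_{i}^{(j)}
  \;\le\;
  \log\!\left( \frac{1}{m} \sum_{j=1}^{m}
      \exp\!\left( \widehat{\mathrm{elpd}}_{i}^{(j)} \right) \right)
```

Equality in the second line holds only when the m values for observation i are all identical.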

trinhdhk · August 11, 2020, 6:20am

Is there a proper way to average the other loo statistics as well, such as SE, p_loo, and k?
