Model comparison for multiple imputation with brm_multiple

This is a similar issue to the one here: Using model comparison (loo or waic) after imputation, but I can't find any discussion more recent than 2021, so I'd like to revive the topic.

I'm using brms. I have a dataset with about 9000 observations where the outcome variable is a latent variable, so I'm using a calibration dataset to fit a linear model for an indicator variable, then posterior_predict to generate a list of datasets with imputed values, and brm_multiple to fit models on those imputed datasets. This is similar to what is described here: Handle Missing Values with brms
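For concreteness, here is a rough sketch of that imputation workflow. All names are hypothetical placeholders (y_true for the quantity of interest, y_measured for the indicator, calibration_data and main_data for the two datasets, x1 and x2 for predictors); the real model structure will differ:

```r
library(brms)

# Calibration model: link the indicator to the outcome of interest using the
# calibration data, where both are observed (hypothetical variable names).
calib_fit <- brm(y_true ~ y_measured, data = calibration_data)

# Each posterior draw yields one completed copy of the main data; keep a few
# draws as the imputed datasets (ndraws is the argument in recent brms versions).
pred <- posterior_predict(calib_fit, newdata = main_data, ndraws = 5)
imputed_datasets <- lapply(seq_len(nrow(pred)), function(i) {
  d <- main_data
  d$y_true <- pred[i, ]
  d
})

# Fit the substantive model on each imputed dataset and pool the draws.
fit_mi <- brm_multiple(y_true ~ x1 + x2, data = imputed_datasets)
```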

I'm hoping to compare a number of models using k-fold cross-validation, as this seems like the best method for a large dataset like this. When I do this (or run other post-fitting diagnostics like loo, waic, or bayes_R2), I get the message


Warning: Using only the first imputed data set. Please interpret the results with caution until a more principled approach has been implemented.
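For reference, a minimal sketch of the kind of post-processing calls that trigger this warning, assuming fit_mi is the combined brm_multiple fit from the workflow above:

```r
library(brms)

# These all fall back to the first imputed dataset, hence the warning.
kf <- kfold(fit_mi, K = 10)  # k-fold cross-validation
ll <- loo(fit_mi)            # approximate leave-one-out CV
r2 <- bayes_R2(fit_mi)       # Bayesian R-squared
```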

So my question: is there a more principled approach? How would you compare model structures for a dataset like this one?

  1. compute cross-validation with each imputed dataset
  2. average pointwise elpd values from these different cross-validations and save them back to a loo/kfold object as pointwise elpd values
  3. use that new loo/kfold object as usual (see the sketch after this list)
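A rough sketch of those three steps, continuing the hypothetical names from above; the per-imputation fits are kept separate with combine = FALSE, and loo() is used here, but kfold() works the same way (its pointwise column is named "elpd_kfold"):

```r
library(brms)
library(loo)

# Keep one brmsfit per imputed dataset so cross-validation can be run on each.
fits <- brm_multiple(y_true ~ x1 + x2, data = imputed_datasets, combine = FALSE)

# Step 1: cross-validation for every imputed dataset.
loos <- lapply(fits, loo)

# Step 2: average the pointwise elpd values across imputations.
elpd_mat <- sapply(loos, function(l) l$pointwise[, "elpd_loo"])
avg_elpd <- rowMeans(elpd_mat)

# Step 3: write the averaged values back into one loo object and use it as usual.
loo_avg <- loos[[1]]
loo_avg$pointwise[, "elpd_loo"] <- avg_elpd
loo_avg$estimates["elpd_loo", ] <- c(sum(avg_elpd),
                                     sqrt(length(avg_elpd)) * sd(avg_elpd))

# The resulting object can then be compared across model structures, e.g.
# loo_compare(loo_avg_model1, loo_avg_model2).
```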