Multiple imputation vs Full Bayesian approach for handling missing data in Stan


Recently I read some literature suggesting that multiple imputation is something of a ‘compromise’ that predates practical full Bayesian computation. The idea is that using multiple imputed values (e.g., 10 imputations) mimics drawing a sample of size 10 from the full posterior distribution, since at that time it was very hard to compute the full posterior distribution and sample from it.
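
For context, the usual way the m completed-data analyses are combined in multiple imputation is Rubin's rules: average the point estimates, then add the average within-imputation variance to an inflated between-imputation variance. Below is a minimal sketch in Python. The function name `pool_rubin` and the numbers are hypothetical and only for illustration; it is not taken from mice, mi, or Stan.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their squared standard errors
    with Rubin's rules. Returns (pooled estimate, total variance)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and one variance per estimate")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance
    return q_bar, t


# Hypothetical results from m = 5 imputed datasets (illustrative numbers only).
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]

pooled, total_var = pool_rubin(estimates, variances)
print(pooled, total_var)
```

The `(1 + 1 / m)` factor is the correction for using only a finite number of imputations, which is one way to see the "small sample from the posterior" interpretation described above.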

I am not sure whether that statement is correct. As I understand it, the packages for multiple imputation (mice or mi) use an iterative conditional regression method, but there seems to be much less documentation on conducting multiple imputation in Stan.

Can I conclude that if we can handle the missing data problem directly in Stan, there is no need for multiple imputation? Or are multiple imputation and the full Bayesian approach just two different ways to handle the problem, with no general standard for saying which one is better?

Hi, I think this post and the discussions in there could be of help:

Any MI model can be done more rigorously using full Bayes, but this can be harder work and the gains can be small.

I’m wondering if anyone knows of any resources that have compared the two approaches? I’m curious how they stack up with one another under various conditions, particularly if/when they can have implications for inference.