Projection is based on minimizing the KL divergence from the reference model's predictive distribution to the constrained model's predictive distribution, separately for each reference posterior draw. For many observation-model distributions, this is equivalent to, or can be approximated by, optimizing the constrained model's parameters given the mean of the reference model's prediction for each posterior draw. With joint missing-data imputation, each reference model posterior draw also includes a draw from the missing-data distribution. The optimization approach to minimizing the KL divergence does not work well for these latent data parameters. We could approximate by keeping the latent data parameters fixed and optimizing only the other parameters, but that would be the same as using a multiple imputation approach, which is a big task to add, as discussed in a GitHub issue.
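As a rough illustration of the draw-by-draw projection described above, here is a minimal Python sketch. It assumes a Gaussian observation model with the noise variance held fixed, in which case minimizing the KL divergence over the coefficients reduces to least squares on the reference model's mean prediction for each draw. The function name `project_draws` and the simulated data are hypothetical, the sketch contains no latent missing-data parameters, and it is not meant to represent any library's actual implementation.

```python
import numpy as np

def project_draws(X_sub, mu_ref):
    """Project each reference posterior draw onto a constrained linear model.

    Assumption: Gaussian observation model with fixed noise variance, so the
    KL-minimizing coefficients are the least-squares fit to the reference mean.

    X_sub  : (n, p) design matrix of the constrained model
    mu_ref : (S, n) reference-model mean predictions, one row per draw
    Returns an (S, p) array of projected coefficients, one row per draw.
    """
    # lstsq accepts a matrix right-hand side, so all S least-squares
    # problems (one column per draw) are solved in a single call.
    coef, *_ = np.linalg.lstsq(X_sub, mu_ref.T, rcond=None)
    return coef.T

# Simulated stand-in for a reference model: S posterior draws of coefficients.
rng = np.random.default_rng(0)
n, S = 50, 4
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_draws = rng.normal(size=(S, 3))
mu_ref = beta_draws @ X_full.T  # (S, n) mean prediction for each draw

# Projecting onto the full design recovers the reference draws themselves.
proj_full = project_draws(X_full, mu_ref)

# Projecting onto a submodel (intercept + first predictor) gives one
# projected coefficient vector per reference draw.
proj_sub = project_draws(X_full[:, :2], mu_ref)
```

Each row of `proj_sub` depends only on the corresponding row of `mu_ref`, which is what "for each reference posterior draw separately" amounts to in this simplified setting.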