Marginalising out missing categorical response variable cases provides inaccurate predictor estimates

Hi @Christopher-Peterson. Thanks!

I did wonder about this, and enquired about it at the end of this post: Log_mix for missing categorical data

The example of missing binary data seems to use the Bernoulli probability estimated by the model as the mixing proportion, so the z_n = 1 component evaluates to P(z_n = 1) P(1 | \phi) = \phi \cdot \phi.
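To make that arithmetic concrete, here is a minimal sketch of how I read the binary example, assuming it is a two-component mixture in which \phi is both the mixing proportion and the probability of a 1. The `log_mix` function below is my own Python reimplementation that mirrors Stan's built-in of the same name, and the value of `phi` is made up purely for illustration.

```python
import math

def log_mix(theta, lp1, lp2):
    """Return log(theta * exp(lp1) + (1 - theta) * exp(lp2)), computed stably."""
    a = math.log(theta) + lp1      # log weight + log likelihood, component 1
    b = math.log1p(-theta) + lp2   # log weight + log likelihood, component 2
    m = max(a, b)                  # log-sum-exp trick to avoid underflow
    return m + math.log(math.exp(a - m) + math.exp(b - m))

phi = 0.3  # hypothetical Bernoulli probability, not taken from any fitted model

# Component for z_n = 1: mixing proportion phi times likelihood phi.
term_one = phi * phi
# Component for z_n = 0: mixing proportion (1 - phi) times likelihood (1 - phi).
term_zero = (1 - phi) * (1 - phi)

# Log of the marginal contribution for one missing binary case.
marginal = log_mix(phi, math.log(phi), math.log1p(-phi))

print(term_one, term_zero, math.exp(marginal))
```

With `phi = 0.3` the z_n = 1 component contributes 0.09 and the two components together sum to 0.58, so under this reading the contribution of a missing case depends on \phi rather than being constant.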

I will have a go at including the mixing proportions and let you know.