In the past I have had success modelling discrete missing outcome variables by marginalizing over the missing values with log_mix. However, I don’t know how to extend this approach to more than two categories. I had been dealing with missing binary data with the following:

```
target += log_mix( theta[i] ,
binomial_logit_lpmf( 1 | 1, p[i] ),
binomial_logit_lpmf( 0 | 1, p[i] )
```

Where theta is the probability that y = 1, and p is a linear model of theta on the logit scale. How could I extend this strategy to more than two levels, say if I wanted to marginalize over three possible categorical values of y?