-----**Edits**-----

Just for transparency and clarification, the initial post was about using `mi()` for missing outcome variables in a non-linear model. That ultimately is not an issue; instead, it’s that the missing outcome is a discrete variable, which Stan takes issue with. The original question has been revised to reflect more relevant issues.

**Context**

I have item-level data for several different cognitive tests that I’d like to fit with IRT models. As with any dataset, there are missing data. Some of those missing data are from people just not being given the test, but a smaller proportion are from people refusing to answer a particular item. In the case of people not administered a test, I don’t really care about their missing data; however, for those who refused to answer an item, I’d like to be able to use the rest of their item responses to impute what their response might have been.

Similarly, there are some missing data points corresponding to someone answering “don’t know” to an item. Traditionally, these missing points are treated as errors, but I suspect that they’re better accounted for by a separate data generation process, which I’d ideally like to model alongside the imputation of refusals and the model-fitting for those with complete data.

**Problem**

The simplest case for the model formula with some missing outcomes would be the following:

```
library(brms)

# 2PL IRT model in non-linear syntax; `response` is 0/1 with NAs
formula <- bf(response | mi() ~ beta + exp(logalpha) * theta,
              theta ~ 0 + (1 | ID),
              beta ~ 1 + (1 |i| Item),
              logalpha ~ 1 + (1 |i| Item),
              nl = TRUE)
```

In essence, the goal is to estimate the parameters of the IRT model and use that as a way of predicting what the missing data for a particular person would be; however, the missing data are discrete variables (0/1) and thus not Stan-friendly with the out-of-the-box `mi()` function. I know there are examples for estimating discrete parameters on this forum and in the Stan documentation; however, the mathematical details of specifying a model that marginalizes out the discrete parameters are beyond my applied statistics skills.
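For the refusals specifically, the marginalization may be simpler than it sounds. A minimal sketch, assuming a Bernoulli likelihood with a logit link (the family is not stated in the formula above) and assuming refusals are missing at random given the person and item parameters; the symbols $y_{ij}$ and $p_{ij}$ are notation introduced here for illustration only:

```latex
% y_{ij}: 0/1 response of person j to item i (notation assumed here)
p_{ij} = \operatorname{logit}^{-1}\!\left(\beta_i + \exp(\log\alpha_i)\,\theta_j\right)

% Summing a missing outcome out of the likelihood:
\sum_{y \in \{0,1\}} p_{ij}^{\,y}\,(1 - p_{ij})^{1-y} = (1 - p_{ij}) + p_{ij} = 1

% So only the observed responses enter the likelihood:
p(y_{\mathrm{obs}} \mid \theta, \beta, \alpha)
  = \prod_{(i,j)\,\in\,\mathrm{obs}} p_{ij}^{\,y_{ij}}\,(1 - p_{ij})^{1-y_{ij}}

% Imputation is then a posterior predictive draw for each posterior sample s:
y_{ij}^{\mathrm{mis},(s)} \sim \operatorname{Bernoulli}\!\left(p_{ij}^{(s)}\right)
```

Under those assumptions a missing 0/1 outcome contributes a factor of 1, so fitting on the observed rows and predicting the refused items afterwards should give the same posterior as carrying the missing responses as discrete unknowns would.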

Related to my thoughts on modeling a separate data generation process for missing due to “don’t know” versus refusals, I don’t have a good sense for how to approach the problem. My thought was some kind of mixture model, but I don’t know how to signal to Stan/`brms` that I want to impute missing data conditionally on the kind of “missingness”.

If it helps at all, what I hypothesize is that those who have “don’t know” responses represent different groups of individuals. Some of those are genuine “no clue, can’t tell you” answers that should be counted as errors, but some are “I don’t want to think too hard about this and just want to get out of here” answers that might not reflect true ability. If I can specify this as a mixture model, then I’ve got a multinomial/categorical likelihood for the mixtures (responds, refuses and doesn’t know, and refuses but could know), and I have some predictors of those group memberships (e.g., depression, subjective cognitive complaints, dementia worry, etc.).
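One way such a mixture could be written down, offered only as a sketch and not as an established model: treat each item as having three observed categories (correct, incorrect, “don’t know”) and let a latent disengagement indicator route some responses straight to “don’t know”. All of the notation is hypothetical: $\pi_j$ is the probability that person $j$ is disengaged, $x_j$ holds the person-level predictors, $\gamma$ their coefficients, $\delta$ the probability that an engaged person who cannot answer says “don’t know” rather than giving a wrong answer, and $p_{ij}$ is the IRT success probability implied by the formula above.

```latex
% Probability of disengagement, driven by person-level predictors (assumed form)
\pi_j = \operatorname{logit}^{-1}\!\left(x_j^{\top}\gamma\right)

% Observed-category probabilities with the latent class summed out
\Pr(\text{correct}_{ij})   = (1 - \pi_j)\,p_{ij}
\Pr(\text{incorrect}_{ij}) = (1 - \pi_j)\,(1 - p_{ij})\,(1 - \delta)
\Pr(\text{DK}_{ij})        = \pi_j + (1 - \pi_j)\,(1 - p_{ij})\,\delta
```

The three probabilities sum to 1, so this is a categorical likelihood with the discrete class already marginalized, which is the form Stan needs. Whether $\pi_j$ and $\delta$ can be separately identified from real data is a question this sketch does not settle.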

**Desired Solutions**

Hopefully this detail is enough to understand the modeling problem that I’m facing, and I’m open to any and all solutions that the community may have. A solution that works within `brms` is ideal, but I’m fine as well just moving to straight Stan. I’m trying to avoid learning BUGS or JAGS just for this case, though perhaps the discrete parameters will render that an impossibility.