Measurement error in a categorical outcome variable

jeremy.koster · March 29, 2018, 8:18pm

Imagine a fairly standard multinomial outcome, such as the contraceptive method used by women in a survey dataset. For simplicity, let’s say that there are four different methods, dubbed a, b, c, and d.

Now suppose that some of the researchers wrote semi-illegible entries on their survey forms. As a result, it is subsequently not possible to discern the exact letter written on the form. Sometimes it is not possible to distinguish between a and d, for instance. But in that particular case, it’s definitely possible to say that the response is not b or c.

To model such data, what methods are used by members of the Stan community?

bgoodri · March 29, 2018, 9:45pm

Yeah, you basically have to marginalize over all of the ways the survey form could say Y = y, which is Pr(y | a) Pr(a) + Pr(y | b) Pr(b) + Pr(y | c) Pr(c) + Pr(y | d) Pr(d), where a through d are the possible true categories. See for example,
https://scholar.google.com/scholar?cluster=1703778821448663685&hl=en&as_sdt=0,33
but ignore all the pre-Stan stuff about how to draw from such posterior distributions.
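
A minimal sketch of that marginalization for the illegible "a or d" case, written in plain Python rather than Stan so it can be run directly. The linear-predictor values and the helper names (`softmax`, `log_sum_exp`, `partial_obs_log_lik`) are illustrative assumptions, not something taken from the thread or the linked paper. The idea is that a response known only to lie in a set of categories contributes the summed probability of that set to the likelihood. In a Stan program the same quantity would typically be added to `target` using the built-in `log_sum_exp`.

```python
import math

def softmax(scores):
    """Turn linear predictors into category probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def partial_obs_log_lik(log_probs, admissible):
    """Log-likelihood of a response known only to be one of `admissible`."""
    return log_sum_exp([log_probs[k] for k in admissible])

# Categories a, b, c, d are indexed 0..3; the scores are made-up values.
scores = [0.5, 0.0, -0.3, 1.0]
probs = softmax(scores)
log_probs = [math.log(p) for p in probs]

clear_a = partial_obs_log_lik(log_probs, [0])      # legible "a"
a_or_d = partial_obs_log_lik(log_probs, [0, 3])    # could be "a" or "d"
anything = partial_obs_log_lik(log_probs, [0, 1, 2, 3])  # fully illegible
```

A fully legible response reduces to the usual categorical log-probability, and a fully illegible one contributes (essentially) zero to the log-likelihood, so the same function covers every form in the dataset.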

jeremy.koster · March 29, 2018, 9:48pm

Thanks, Ben. I’ll take a look. Meanwhile, am I right to infer that it’s possible in Stan to attach different probabilities to the outcomes? For instance, we might be 90% sure that it’s a, with the remaining 10% probability attached to d.

bgoodri · March 29, 2018, 10:08pm

That should be fine, but you have to construct the probability vector accordingly. It may be better to just set the hyperpriors so that there is only a very small probability of the answer appearing to be in one category given that the truth was another category.
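
Both options can be sketched in a few lines of plain Python, under some loudly stated assumptions. The true-category probabilities, the error rate `eps`, and the function names below are hypothetical. The 90%/10% weights are treated as relative likelihoods of the handwriting under each true category, which is only one possible reading of "90% sure". In a real model the misclassification rows would be parameters with priors concentrated near the identity matrix rather than fixed numbers.

```python
def weighted_lik(true_probs, weights):
    """Likelihood of one form: sum_k weights[k] * Pr(true = k)."""
    return sum(w * p for w, p in zip(weights, true_probs))

def observed_probs(true_probs, misclass):
    """Pr(recorded = y) = sum_k Pr(true = k) * misclass[k][y].

    misclass[k][y] is Pr(recorded y | true k); each row sums to 1.
    """
    n = len(true_probs)
    return [
        sum(true_probs[k] * misclass[k][y] for k in range(n))
        for y in range(n)
    ]

# Made-up probabilities of the true categories a, b, c, d.
true_probs = [0.4, 0.3, 0.2, 0.1]

# Option 1: a form read as "90% a, 10% d", never b or c.
lik_90_10 = weighted_lik(true_probs, [0.9, 0.0, 0.0, 0.1])

# Option 2: a misclassification matrix with a small off-diagonal rate.
eps = 0.02  # assumed chance of recording each particular wrong category
misclass = [
    [1 - 3 * eps if k == y else eps for y in range(4)]
    for k in range(4)
]
recorded = observed_probs(true_probs, misclass)
```

With weights of exactly 0 and 1, `weighted_lik` reduces to summing the probabilities of the categories that remain possible, so the hard "a or d" case is a special case of the weighted one.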
