I am working on a measurement error model with a latent categorical predictor variable. What I want is very similar to a continuous measurement error model, but instead of a known standard error for each observation I have a known simplex of probabilities over the categories of the latent variable for each observation. For illustration, my data look like:
data.frame(y = c(1, 1, 0, 1, 0), pr_Xa = c(0.3, 0.2, 0.4, 0.7, 0.3), pr_Xb = c(0.1, 0.1, 0.5, 0.1, 0.4), pr_Xc = c(0.6, 0.7, 0.1, 0.2, 0.3), X_obs = c("c", "c", "b", "a", "b"))
A simple model that ignores the measurement error would look like y ~ X_obs, where X_obs is the “observed” category for the latent variable, which we get by taking the category with the highest probability in each observation's simplex.
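To make the definition of X_obs concrete, here is a minimal R sketch that recovers it from the three probability columns of the illustrative data above. The names `d`, `probs`, and `X_obs_check` are just for this illustration.

```r
# Illustrative data from the question
d <- data.frame(
  y     = c(1, 1, 0, 1, 0),
  pr_Xa = c(0.3, 0.2, 0.4, 0.7, 0.3),
  pr_Xb = c(0.1, 0.1, 0.5, 0.1, 0.4),
  pr_Xc = c(0.6, 0.7, 0.1, 0.2, 0.3),
  X_obs = c("c", "c", "b", "a", "b")
)

# One row per observation, one column per category; each row is a simplex
probs <- as.matrix(d[, c("pr_Xa", "pr_Xb", "pr_Xc")])

# "Observed" category = the category with the highest probability in each row
X_obs_check <- c("a", "b", "c")[apply(probs, 1, which.max)]

# The derived categories should match the X_obs column given in the data
stopifnot(identical(X_obs_check, d$X_obs))
```

Collapsing each row to its most likely category discards the remaining probability mass, which is the information the measurement error model is meant to keep.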
But how can I include the measurement error simplex for each observation in a logistic regression model in Stan? I have tried adapting the code from this example, but if I’m understanding it right, that example assigns equal measurement error probabilities to each observation?
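For concreteness, one way to read "including the simplex for each observation" is as a mixture likelihood: each observation's contribution is the Bernoulli likelihood under every possible category, weighted by that observation's own probabilities. The R sketch below only illustrates that quantity under this assumption. It is not taken from the linked example, and `marginal_loglik` and `beta` (one log-odds value per category) are hypothetical names.

```r
# Illustrative data from the question
y <- c(1, 1, 0, 1, 0)
probs <- cbind(
  a = c(0.3, 0.2, 0.4, 0.7, 0.3),
  b = c(0.1, 0.1, 0.5, 0.1, 0.4),
  c = c(0.6, 0.7, 0.1, 0.2, 0.3)
)

# Hypothetical helper: log-likelihood with the latent category summed out,
# using a different probability vector for every observation.
marginal_loglik <- function(beta, y, probs) {
  p_y1 <- plogis(beta)  # P(y = 1 | X = k) for each category k
  # N x K matrix of P(y_i | X = k)
  lik_k <- outer(y, p_y1, function(yi, p) dbinom(yi, size = 1, prob = p))
  # Weight by each row's simplex, sum over categories, then take logs
  sum(log(rowSums(probs * lik_k)))
}

# Evaluate at made-up coefficient values, purely to show the call
ll <- marginal_loglik(beta = c(0, 0, 0), y = y, probs = probs)
```

In Stan the same sum would presumably be computed on the log scale with `log_sum_exp`, with the per-observation simplexes passed in as data.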