I am trying to fit a model to the following data (only 15 rows of the full dataset are shown):
X1 X2 X3 X4 X5 Y X6
3 0 2 2 1 2 1
2 0 3 3 1 3 1
1 0 2 NA 1 2 1
3 1 2 1 1 1 0
3 1 3 3 3 2 NA
1 0 2 1 3 3 NA
3 1 1 3 3 1 1
1 0 1 3 2 1 0
3 1 NA 3 3 2 0
1 0 NA 1 1 2 1
2 1 2 3 1 2 NA
3 1 1 3 2 2 0
2 0 2 1 3 3 0
3 0 NA 3 2 2 0
3 0 3 3 3 3 1
I am trying to predict Y from the remaining variables (X1, …, X6). When I omit the rows with NA values, I use a categorical likelihood with a Dirichlet prior. However, since both Y and the Xs have missing values, I decided to build a second model that includes those rows as well: I use a categorical distribution for X1, X3, X4 and X5 and a Bernoulli distribution for X2 and X6, so that their missing values can be predicted too. As I am new to Stan, I cannot write this model properly, and I cannot find a similar example to adapt. If anyone could propose a model to start with, or point me to a similar example, I would be grateful.
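
To make the question concrete, here is a minimal Python sketch (not Stan code) of the computation I believe is needed. As far as I understand, Stan does not sample discrete unknowns, so a missing categorical value has to be summed out of the likelihood rather than declared as a parameter. The sketch assumes a simplified setup with a single three-level predictor X and a three-level outcome Y. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def row_log_lik(x, y, theta_x, theta_y_given_x):
    """Log-likelihood of one (x, y) row; None marks a missing value (NA).

    Categories are 1-based, as in the data.
    theta_x[k-1] = P(X = k)
    theta_y_given_x[k-1][j-1] = P(Y = j | X = k)
    """
    n_x = len(theta_x)
    # Candidate values for X: the observed one, or every category if missing.
    x_values = [x] if x is not None else range(1, n_x + 1)
    terms = []
    for k in x_values:
        lp = math.log(theta_x[k - 1])
        if y is not None:
            lp += math.log(theta_y_given_x[k - 1][y - 1])
        # A missing Y sums out to probability 1, so it contributes nothing.
        terms.append(lp)
    return log_sum_exp(terms)


# Hypothetical parameter values, for illustration only.
theta_x = [0.2, 0.3, 0.5]
theta_y_given_x = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]

observed = row_log_lik(3, 2, theta_x, theta_y_given_x)      # X and Y observed
missing_x = row_log_lik(None, 2, theta_x, theta_y_given_x)  # X is NA
missing_y = row_log_lik(3, None, theta_x, theta_y_given_x)  # Y is NA
```

In Stan I would expect the same idea to be written with `categorical_lpmf` and `log_sum_exp` inside a loop over rows, with the Dirichlet priors placed on the probability vectors, but I do not know how to extend this cleanly to six predictors with missing values in several of them.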