I am looking to model missing covariates in a regression model using Stan and/or brms.
I am able to do this when the covariate is continuous, for example using the mi() syntax in brms, as in the “Imputation during model fitting” section of this vignette: https://cran.r-project.org/web/packages/brms/vignettes/brms_missings.html
However, when the covariate is discrete, I don’t believe this is possible directly: Stan does not support discrete parameters, because HMC needs gradients with respect to every parameter. I’ve read about marginalization (summing the likelihood over the possible values of the missing covariate) as a way around this, but I haven’t had much luck with that approach.
For example, what if the missing covariate is binary, and I want to model it with a logistic regression? Or what if it is a count variable, and I want to model it as Binomial or Poisson? Or an ordinal variable, modeled with a proportional odds model? Because these missing discrete covariates would be treated as parameters in the main regression model of interest, I don’t see how to achieve this in Stan.
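To make the binary case concrete, here is a minimal sketch of the marginalization as I understand it, written in plain Python rather than Stan. It assumes a Gaussian outcome y with mean alpha + beta * x, a missing binary covariate x, and a logistic regression for x on one fully observed covariate z. All names (alpha, beta, sigma, gamma0, gamma1, z) are hypothetical and only for illustration.

```python
import math

def inv_logit(x):
    # Logistic function: maps a linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-x))

def normal_lpdf(y, mu, sigma):
    # Log density of Normal(mu, sigma) evaluated at y.
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def log_sum_exp(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def branch_log_probs(y, z, alpha, beta, sigma, gamma0, gamma1):
    # Joint log probability of (y, x) for x = 1 and x = 0.
    # Assumed model (hypothetical):
    #   x ~ Bernoulli(inv_logit(gamma0 + gamma1 * z))
    #   y | x ~ Normal(alpha + beta * x, sigma)
    p = inv_logit(gamma0 + gamma1 * z)
    lp_x1 = math.log(p) + normal_lpdf(y, alpha + beta, sigma)
    lp_x0 = math.log1p(-p) + normal_lpdf(y, alpha, sigma)
    return lp_x1, lp_x0

def marginal_loglik(y, z, alpha, beta, sigma, gamma0, gamma1):
    # Log-likelihood of y with the missing x summed out:
    # log[ P(x=1) f(y | x=1) + P(x=0) f(y | x=0) ].
    lp_x1, lp_x0 = branch_log_probs(y, z, alpha, beta, sigma, gamma0, gamma1)
    return log_sum_exp(lp_x1, lp_x0)

def posterior_prob_x1(y, z, alpha, beta, sigma, gamma0, gamma1):
    # P(x = 1 | y, z, parameters), recoverable after fitting.
    lp_x1, lp_x0 = branch_log_probs(y, z, alpha, beta, sigma, gamma0, gamma1)
    return math.exp(lp_x1 - log_sum_exp(lp_x1, lp_x0))
```

As far as I can tell, in Stan this per-observation quantity would be added to target for each row with a missing x (using log_sum_exp or log_mix), while rows with observed x contribute their usual two terms. What I can’t see is how this generalizes cleanly to counts or to several missing discrete covariates at once.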
Would it be better to use another probabilistic programming language, such as NIMBLE or PyMC, that can combine HMC with other MCMC samplers capable of handling discrete parameters? For example, the R package JointAI (https://www.jstatsoft.org/article/view/v100i20) uses JAGS to achieve the type of imputation I am seeking.
Any guidance on this issue would be greatly appreciated.