Censored Data: Modeling vs Integrating Out

Hi -

I have longitudinal data in which a large share of the observations are “censored” or “missing”. The Stan manual shows how to implement the standard approaches in Stan code, but I’m more curious about the effect of these approaches and how each biases the posterior.

I’ve done imputation by regression before, as suggested in Gelman and Hill 2009, but I can’t just use that trick here and add noise, both because of the structure of the data and because of how it’s missing: at certain time points entire entries drop out, or there is a response but none of the covariates the researcher is interested in.
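For concreteness, here is a minimal sketch of the “imputation by regression plus noise” trick referred to above, using only the Python standard library. All variable names and the simulated data are illustrative assumptions, not from the actual dataset:

```python
# Hypothetical sketch: regression imputation with added residual noise,
# so the imputed values are not artificially precise.
import random
import statistics

random.seed(0)

# Simulate a covariate x and an outcome y = 2 + 3x + noise, with some y missing.
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 + 3 * xi + random.gauss(0, 1) for xi in x]
missing = [i % 5 == 0 for i in range(n)]  # every 5th y is missing

# Fit simple least-squares regression on the observed cases only.
xo = [xi for xi, m in zip(x, missing) if not m]
yo = [yi for yi, m in zip(y, missing) if not m]
xbar, ybar = statistics.fmean(xo), statistics.fmean(yo)
slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xo, yo))
         / sum((xi - xbar) ** 2 for xi in xo))
intercept = ybar - slope * xbar
resid_sd = statistics.stdev(
    yi - (intercept + slope * xi) for xi, yi in zip(xo, yo)
)

# Impute each missing y as prediction + residual-scale noise.
y_imp = [yi if not m else intercept + slope * xi + random.gauss(0, resid_sd)
         for xi, yi, m in zip(x, y, missing)]
```

This only works when the covariates needed for the regression are themselves observed, which is exactly what breaks in the situation described above (whole entries dropping out, or a response with no covariates).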

Any papers anyone favors? Any initial thoughts/suggestions before I get sucked into the literature wormhole?

Thanks all!


oh I have BDA3 in front of me, I’ll take a look there :)

They don’t.

OK - effect of one approach versus the other?

If you can integrate the missing values out analytically, that is presumably going to be better computationally because the posterior distribution is lower-dimensional.
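As an illustration of what “integrate out analytically” means for censoring: a right-censored observation contributes log P(Y > c) to the likelihood instead of a density term, so no latent parameter per censored value is needed. A minimal sketch with the standard library (names illustrative; the `normal_logccdf` term is the analytic integral that Stan’s `normal_lccdf` increment computes):

```python
import math

def normal_logpdf(y, mu, sigma):
    # log density of Normal(mu, sigma) at y
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

def normal_logccdf(c, mu, sigma):
    # log P(Y > c) for Y ~ Normal(mu, sigma): the censored region
    # integrated out in closed form via the complementary error function.
    return math.log(0.5 * math.erfc((c - mu) / (sigma * math.sqrt(2))))

def censored_loglik(obs, n_censored, censor_point, mu, sigma):
    # Observed values enter through the density; each right-censored case
    # enters through the tail probability above the censoring point.
    ll = sum(normal_logpdf(y, mu, sigma) for y in obs)
    ll += n_censored * normal_logccdf(censor_point, mu, sigma)
    return ll
```

The alternative (declaring each censored value as a parameter constrained above the censoring point) targets the same posterior over (mu, sigma), but the sampler then has to explore those extra dimensions.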


Can you please provide a citation?

nvm solved