Missing data imputation in Stan

stan_beginer · November 6, 2020, 2:00pm

I am a little bit confused about missing data imputation in Stan. As a simple example, suppose Y and X have the following relation:

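data {
  int<lower=0> N;        // number of observations
  vector[N] x;           // predictor
  vector[N] y;           // outcome
}
parameters {
  real alpha;            // intercept
  real beta;             // slope
  real<lower=0> sigma;   // residual standard deviation
}
model {
  vector[N] mu = alpha + beta * x;
  alpha ~ normal(0, 100);
  beta ~ normal(0, 100);
  y ~ normal(mu, sigma);
}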

Suppose now we have a missing data issue, where we observe all of the Ys (e.g., a vector of length n) but only some of the Xs (e.g., a vector of length 2n/3), and we are interested in imputing the missing values of X. In this case, should I put X into the ‘data’ block or the ‘parameters’ block?

I am also confused about the general idea of using Bayesian methods for missing data imputation. My understanding is that in a Bayesian setting we should treat the missing data as ‘parameters’. However, in the situation above we also observe 2n/3 of the data, so declaring all of X as ‘parameters’ does not seem to make sense. Should we treat the observed 2n/3 of X as data and the missing n/3 of X as parameters?

Thx!

andrjohns · November 6, 2020, 2:08pm

I’d recommend reading through the section on Missing Data in the Stan User’s Guide: https://mc-stan.org/docs/2_25/stan-users-guide/missing-data-and-partially-known-parameters.html
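
As a rough illustration of the general idea (observed values go in the data block, missing values are declared as parameters), here is a minimal sketch applied to the regression above. It is not taken from the guide. The names N_obs, N_mis, x_obs, x_mis, y_xobs, y_xmis, mu_x and sigma_x are hypothetical, and the normal model for x is an assumption made only for illustration. The sketch also assumes the x values are missing at random, and it leaves mu_x, sigma_x and sigma without explicit priors, so Stan treats them as uniform over their support.

```stan
data {
  int<lower=0> N_obs;       // rows where x is observed
  int<lower=0> N_mis;       // rows where x is missing
  vector[N_obs] x_obs;      // observed predictor values
  vector[N_obs] y_xobs;     // outcomes for rows with observed x
  vector[N_mis] y_xmis;     // outcomes for rows with missing x
}
parameters {
  real alpha;               // intercept
  real beta;                // slope
  real<lower=0> sigma;      // residual standard deviation
  real mu_x;                // assumed: mean of the distribution of x
  real<lower=0> sigma_x;    // assumed: sd of the distribution of x
  vector[N_mis] x_mis;      // missing x values, estimated as parameters
}
model {
  alpha ~ normal(0, 100);
  beta ~ normal(0, 100);
  // Assumed model for x: observed and missing values share one distribution.
  x_obs ~ normal(mu_x, sigma_x);
  x_mis ~ normal(mu_x, sigma_x);
  // Same regression for both groups of rows.
  y_xobs ~ normal(alpha + beta * x_obs, sigma);
  y_xmis ~ normal(alpha + beta * x_mis, sigma);
}
```

The posterior draws of x_mis are then the imputed values. Each one is informed both by the assumed distribution of x and by the y observed in the same row.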

If you have access, I’d also suggest reading the 2nd edition of Statistical Rethinking by Richard McElreath (Chapter 15), which covers this in more depth.

mike-lawrence · November 7, 2020, 2:00am

I also have a lecture here (discussion of missing data starts at around the halfway mark).

stan_beginer · November 7, 2020, 9:01pm

Those are very interesting resources, thanks again!
