These questions came out of conversations I had today with @mariel and @ignacio.
In the stan_lm vignette, @jonah and @bgoodri suggest investigating the average treatment effect using posterior_predict
.
clouds_cf <- clouds
clouds_cf$seeding[] <- "yes"
y1_rep <- posterior_predict(post, newdata = clouds_cf)
clouds_cf$seeding[] <- "no"
y0_rep <- posterior_predict(post, newdata = clouds_cf)
qplot(x = c(y1_rep - y0_rep), geom = "histogram", xlab = "Estimated ATE")
From what I understand (please correct me if this is wrong), posterior_predict
is not just calculating \hat y_i = X_i\hat\beta (the expected value of y); it is drawing a new \tilde y_i from the posterior for each MCMC iteration. This incorporates the residual variability into the prediction.
First off, is there a problem with the above calculation, in that it assumes the draws y1_rep
and y0_rep
are independent? I would think you want to make this calculation in a way that takes into account the fact that these are the same people (really the same observation) in each case. Another way to look at it, is that I would expect y_1 - y_0 to be smaller if you are comparing those observations on the same person, as compared to two different people that happen to have the same covariates. I don’t think your suggested procedure accounts for this.
Second, I just want to point out that this is not the “standard” way of calculating an ATE, which is done by most software packages that I’ve come across. Most of the time, \hat y_i = X_i\hat\beta is used as the prediction, and the residual variability is ignored. When this is done, the issue I raise above is moot. I’m not saying that your approach is wrong, just pointing out that it is not the standard.
I’m curious about your thoughts to these issues. Thanks!