Greetings all,
I am curious about the use of sampling weights in Stan, and perhaps more generally in Bayesian analysis. I have read that as long as we have the covariates that determined the probability of selection into the sample, we can just include them in a typical hierarchical model. But I have noticed that if we are actually supplied the weights (which are built from selection probabilities within strata defined by the covariates), we can use them as multiplicative constants on each observation's log-likelihood contribution. So, first, is there a preference, and second, how would one use weights in a Stan setup for a normal linear hierarchical model? And if one has weights at two sampling levels, how might that be set up? Please feel free to direct me to any existing Stan code out there.
Thanks
David
See discussion and references here.
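On the idea of using weights as multipliers, here is a minimal sketch in Python of what that usually means: each observation's log-density is multiplied by its weight before summing (a weighted pseudo-likelihood). The function names and the toy numbers are made up for illustration. In Stan the analogous statement would be something like `target += w[n] * normal_lpdf(y[n] | mu[n], sigma);`, with `w`, `y`, `mu`, and `sigma` being whatever your model calls them.

```python
import math

def normal_logpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def weighted_loglik(y, mu, sigma, w):
    """Weighted pseudo-log-likelihood: sum of w_i * log p(y_i | mu_i, sigma)."""
    return sum(w_i * normal_logpdf(y_i, mu_i, sigma)
               for y_i, mu_i, w_i in zip(y, mu, w))

# Hypothetical toy data: the second observation carries weight 2.
y = [1.2, 0.4, 2.1]
mu = [1.0, 0.5, 1.8]
w = [1.0, 2.0, 1.0]
ll_weighted = weighted_loglik(y, mu, 1.0, w)

# An integer weight of 2 contributes the same as entering the observation twice.
ll_duplicated = weighted_loglik([1.2, 0.4, 0.4, 2.1], [1.0, 0.5, 0.5, 1.8], 1.0, [1.0] * 4)
```

Because a weight acts like a replication count, the overall scale of the weights changes how concentrated the resulting posterior is, so it is worth being deliberate about how the weights are normalized.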
This is great. Thank you all.
First, I think it is important to be clear about what you are trying to do with the regression. If you want to estimate population parameters from a survey (e.g., political polls) with a non-random sample, you can use multilevel regression and post-stratification (MRP), or weights.
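To make the post-stratification step concrete, here is a rough sketch in Python. It assumes you already have an estimate for each population cell (in MRP these would come from the multilevel model) and the known population count for each cell; the cell names and numbers below are invented for illustration.

```python
def poststratify(cell_estimates, cell_counts):
    """Population estimate as the count-weighted average of cell estimates."""
    total = sum(cell_counts.values())
    return sum(cell_estimates[c] * cell_counts[c] for c in cell_counts) / total

# Hypothetical cells: model-based estimates and census-style population counts.
cell_estimates = {"young": 0.40, "old": 0.60}
cell_counts = {"young": 300, "old": 700}

population_estimate = poststratify(cell_estimates, cell_counts)
print(round(population_estimate, 2))
```

The sample composition does not enter this step at all: only the cell estimates and the population counts do, which is why a sample that over-represents some cells can still yield a sensible population estimate.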
If you want to get an unbiased estimate of the association between a predictor and an outcome, the method should depend on your assumptions about the data-generating process (only some of which you can check).
Adjusting for covariates, MRP, and weighting can all produce unbiased association estimates, but which works best depends on the relationships among the covariates, the predictor of interest, and the outcome. This is, I think, best explained with directed acyclic graphs.
In short: (1) If your covariates are confounders (i.e., common causes of your predictor and outcome variable) and also not colliders on a backdoor path, then including the covariates in the regression, MRP, and weighting all work in theory. Which method is most reliable for a particular analysis depends on things like the number of covariates and data points. (2) If you have “only” effect modification/heterogeneous (treatment) effects of your predictor of interest, then MRP and weighting will work (in theory). (3) If one of your covariates (that predicts selection) is a collider, then only weighting can give you unbiased estimates.
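Case (3) can be illustrated with a small simulation. This is only a sketch under an assumed data-generating process: the true slope, the collider `c`, and the selection rule are all invented for the example, and the selection probabilities are taken as known so that inverse-probability weights can be formed.

```python
import math
import random

random.seed(1)
TRUE_SLOPE = 0.5  # assumed true effect of x on y

xs, ys, ws = [], [], []
for _ in range(100_000):
    x = random.gauss(0, 1)
    y = TRUE_SLOPE * x + random.gauss(0, 1)
    c = x + y + random.gauss(0, 1)            # collider: caused by both x and y
    p = 0.05 + 0.9 / (1 + math.exp(-2 * c))   # selection probability depends on c
    if random.random() < p:                   # unit ends up in the sample
        xs.append(x)
        ys.append(y)
        ws.append(1 / p)                      # inverse-probability sampling weight

def weighted_slope(x, y, w):
    """Slope of a weighted least-squares regression of y on x (with intercept)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx

unweighted = weighted_slope(xs, ys, [1.0] * len(xs))
weighted = weighted_slope(xs, ys, ws)
print("unweighted slope:", unweighted)
print("weighted slope:  ", weighted)
```

Under these assumptions the unweighted slope is pulled toward zero, because selecting on `c` induces a negative association between `x` and `y` in the sample, while the weighted slope should land close to the true value. Adding `c` as a regressor would not fix this, since conditioning on a collider opens the same path.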