Estimating pre-treatment baseline

simonkamronn · November 14, 2017, 10:40am

I’m estimating treatment effects from a long-running experiment with thousands of observations for each participant. Because the participants all start from different baselines, I want to use the first two weeks of data before the intervention as a pre-treatment baseline, but I’m unsure how to specify that in Stan. The model is along the lines of
$$
y_i = \mu + \mu^{pre}_{s[i]} + \tau_{t[i]} + \alpha_{g[i]} + \beta X_i + \epsilon
$$
where $\mu^{pre}_{s[i]}$ is the subject-specific baseline. I guess I can specify two models, one for the pre-treatment and one for the post-treatment data, e.g.

y_pre ~ normal(intercept + baseline[subject_ind_pre], sigma_eps);
y_post ~ normal(intercept + baseline[subject_ind_post] + treatment[treatment_ind_post], sigma_eps);

but how do I make sure baseline is only learnt from the pre-treatment data? Or is that not necessary?

Bob_Carpenter · November 27, 2017, 8:47pm

Not sure why nobody answered this one, but I’m making a cleanup run on the list so I’ll give it a go.

In Bayesian stats, information flows in both directions. And this is almost always a good thing.

In your case, information will flow from y_post into estimating baseline and sigma_eps and intercept. This means the post-treatment data will help estimate the pre-treatment baseline.
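
As a sketch of why this happens, assume the two sampling statements from the original post. Both likelihood terms contain baseline, so the joint posterior has the form below. The shorthand is introduced here for illustration and is not from the original post: $b$ stands for baseline, $\tau$ for treatment, and $\theta$ for the remaining shared parameters (intercept, sigma_eps).

$$
p(b, \tau, \theta \mid y^{pre}, y^{post}) \propto p(y^{pre} \mid b, \theta) \, p(y^{post} \mid b, \tau, \theta) \, p(b, \tau, \theta)
$$

Because $b$ appears in both likelihood factors, its posterior is shaped by both data sets.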

Why do you think the shared parameters should only be estimated based on pre-treatment data?

simonkamronn · November 28, 2017, 8:22am

Thanks for answering. Since the baseline adds an intercept for each participant and I have some baseline covariates I want to include, I’m worried about collinearity and not being able to identify my treatment effect. I’m in the unfortunate position of not having a large number of participants and already battling with collinearity.

Bob_Carpenter · November 28, 2017, 5:57pm

A hierarchical prior here should help identify the model. If you have a global intercept, make sure the individual-level intercepts are centered around zero. I’d recommend going even further and using a non-centered parameterization, which also pulls the scale out of the prior and makes the posterior much better behaved for Stan to fit.

Having more data from the treatment feed back into shared parameter estimates should help with identifiability, not hurt.
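
A minimal sketch of what this could look like, reusing the variable names from the original post. Everything else is an assumption made for illustration: the names baseline_raw, sigma_baseline, S and T, the prior scales, and the use of current Stan array syntax (older Stan versions declare arrays differently). The group effect and covariate terms from the full model are left out to keep it short.

```stan
data {
  int<lower=1> N_pre;                                   // pre-treatment observations
  int<lower=1> N_post;                                  // post-treatment observations
  int<lower=1> S;                                       // subjects (assumed name)
  int<lower=1> T;                                       // treatment arms (assumed name)
  array[N_pre] int<lower=1, upper=S> subject_ind_pre;
  array[N_post] int<lower=1, upper=S> subject_ind_post;
  array[N_post] int<lower=1, upper=T> treatment_ind_post;
  vector[N_pre] y_pre;
  vector[N_post] y_post;
}
parameters {
  real intercept;                 // global intercept
  vector[S] baseline_raw;         // standardized subject offsets
  real<lower=0> sigma_baseline;   // scale of the subject offsets
  vector[T] treatment;
  real<lower=0> sigma_eps;
}
transformed parameters {
  // Non-centered: offsets are zero-centered and scaled outside the prior.
  vector[S] baseline = sigma_baseline * baseline_raw;
}
model {
  // Prior scales are placeholders; choose them to match the scale of y.
  intercept ~ normal(0, 5);
  baseline_raw ~ std_normal();
  sigma_baseline ~ normal(0, 1);
  treatment ~ normal(0, 1);
  sigma_eps ~ normal(0, 1);

  // baseline is shared, so both data sets inform it.
  y_pre ~ normal(intercept + baseline[subject_ind_pre], sigma_eps);
  y_post ~ normal(intercept + baseline[subject_ind_post]
                  + treatment[treatment_ind_post], sigma_eps);
}
```

Here the subject offsets have prior mean zero, so they are separated from the global intercept, and treatment enters only the post-treatment likelihood.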
