Including a posterior from a previous experiment for a coefficient

A bit of an odd question perhaps, so please correct my thinking if I’m looking at this the wrong way.

I want to estimate a normal regression model with two covariates A and B:

y \sim N(\beta_0 + \beta_1 A + \beta_2 B,\sigma)

However, I do not currently have any data from which I can estimate \beta_1, but I know its distribution from a previous experiment. For instance, I may know that \beta_1 \sim N(-0.7,0.2).

I want to tell Stan what I know about \beta_1, but I do not want Stan to treat this knowledge as a prior that the data will then update. Instead, I want Stan to treat my information as the posterior for \beta_1 and use it to estimate the model and sample the posterior for \beta_0 and \beta_2.

Can this be done, and if so, how would I code this up?

Thanks in advance.

Hi mben,

As far as I’m aware this is not possible. This is because the parameters are sampled jointly, so the draws for each parameter depend on the values of the others. So while you might ‘know’ the distribution of \beta_1 going in, it will inform the estimation of \beta_0 and \beta_2, which would in turn imply a different distribution for \beta_1 than the one you’ve supplied.

Additionally, as you might have guessed from the first part of my answer, the posterior distribution of \beta_1 isn’t as simple as saying that \beta_1 \sim N(-0.7, 0.2). In practice, the posterior is represented by every sampled value of a given parameter across the sampling period. So in the case of the Stan defaults of 4 chains with 2000 iterations each (discarding the first 1000 of each chain as warmup), the posterior distribution of \beta_1 would be the 4000 values of \beta_1 drawn during sampling.
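To make the "4000 draws" point concrete, here is a small sketch in plain Python. It does not call Stan at all: the draws are faked with `random.gauss`, and the N(-0.7, 0.2) target is only there as a stand-in (an assumption for illustration). What it shows is just the bookkeeping: 4 chains, 2000 iterations each, drop the first 1000 of each chain, pool the rest, and summarise.

```python
import random
import statistics

random.seed(1)

N_CHAINS = 4      # Stan's default number of chains
N_ITER = 2000     # total iterations per chain (warmup + sampling)
N_WARMUP = 1000   # iterations discarded from the start of each chain

# Stand-in for a sampler: each "chain" is a list of independent normal draws.
# A real Stan fit would produce correlated draws via HMC; this is only illustrative.
chains = [
    [random.gauss(-0.7, 0.2) for _ in range(N_ITER)]
    for _ in range(N_CHAINS)
]

# Discard warmup from every chain and pool what is left.
draws = [d for chain in chains for d in chain[N_WARMUP:]]

# The "posterior" is this collection of draws; summaries are computed from it.
post_mean = statistics.mean(draws)
post_sd = statistics.stdev(draws)
print(len(draws), round(post_mean, 2), round(post_sd, 2))
```

A statement like \beta_1 \sim N(-0.7, 0.2) is then a summary of such a set of draws, not the set itself.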

Does that help?

Cheers,
Andrew

Thank you Andrew for your explanation, this makes complete sense.

I’m thinking that I could solve this for my particular case by replacing \beta_1 A with a variable X, treating X as missing data, and imputing it from the posterior (N(-0.7,0.2)) during sampling. This may not be a general solution to the problem I stated, but it will likely work for what I am trying to do.
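For what it’s worth, here is one rough way to read that idea, sketched outside Stan in plain Python. Everything in it is an assumption for illustration: the data are synthetic, the helper `ols_simple` is hypothetical, and A is assumed to be observed alongside y and B. Each pass draws \beta_1 from N(-0.7, 0.2), subtracts \beta_1 A from y, and fits the remaining intercept and slope by least squares, so \beta_1 is never updated by the data. It only propagates the uncertainty in \beta_1 and ignores the sampling uncertainty in \beta_0, \beta_2 and \sigma, so it is not a full posterior.

```python
import random
import statistics

random.seed(2)

# Hypothetical synthetic data, generated only so the sketch can run.
n = 200
A = [random.gauss(0, 1) for _ in range(n)]
B = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 - 0.7 * a + 2.0 * b + random.gauss(0, 0.5) for a, b in zip(A, B)]

def ols_simple(x, z):
    """Least-squares intercept and slope for z = b0 + b2 * x."""
    mx, mz = statistics.mean(x), statistics.mean(z)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mz - slope * mx, slope

beta0_draws, beta2_draws = [], []
for _ in range(1000):
    # Draw beta_1 from the externally supplied distribution; it is never updated.
    beta1 = random.gauss(-0.7, 0.2)
    # Remove the beta_1 * A contribution, then fit the rest on B.
    resid = [yi - beta1 * ai for yi, ai in zip(y, A)]
    b0, b2 = ols_simple(B, resid)
    beta0_draws.append(b0)
    beta2_draws.append(b2)

print(statistics.mean(beta0_draws), statistics.stdev(beta0_draws))
print(statistics.mean(beta2_draws), statistics.stdev(beta2_draws))
```

The spread in `beta0_draws` and `beta2_draws` reflects only how much the fitted values move as \beta_1 varies. Doing this inside Stan would need a different construction, which this sketch does not attempt.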