I am doing an applied project. I am wondering if there is any way to impute a distribution in Stan.
Separating a larger model into several smaller models can be either a computational approximation (e.g., replacing a multilevel model with a two-stage regression) or a theoretical necessity (e.g., the calibration model and the regression model are assumed independent in a measurement-error model, or the "do" operator in causal inference).
To start with, consider a model
data {
  real y;
}
parameters {
  real beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(beta, sigma);
}
Instead of inferring the exact posterior distribution of beta given y, suppose that in the ideal situation I have already fitted another calibration model and know that beta is exactly 1.
Then I can impute beta by moving it into the data block:
data {
  ...
  real beta;
}
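For concreteness, here is a minimal sketch of that version (my own rewrite of the toy model above, simply passing the calibration value in as data):

data {
  real y;
  real beta;                 // known calibration value, e.g. beta = 1
}
parameters {
  real<lower=0> sigma;
}
model {
  y ~ normal(beta, sigma);   // beta is fixed; only sigma is inferred
}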
More realistically, I may only know the distribution of beta from the separately fitted calibration model, say p(beta | calibration model) = N(0, 1), and I assume beta is independent of y, so that p(beta | y) = N(0, 1). Conceptually, I am imagining something like
data {
  ...
  real beta;
  beta ~ normal(0, 1);
}
In other words, it imputes the value of beta from a pre-fixed distribution. If beta were discrete, I would know how to do that: I could marginalize beta out with a finite mixture. When beta is continuous, the marginal likelihood becomes an integral, p(y | sigma) = ∫ normal(y | beta, sigma) normal(beta | 0, 1) d beta. Simply setting a prior beta ~ normal(0, 1) in the model block does not achieve the same effect, because then beta would still be updated by y.
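One way I could imagine approximating that integral is a quadrature version of the finite-mixture trick. This is just a sketch of my own; the grid size K, the grid beta_grid, and the log weights log_w (quadrature weight times the N(0, 1) density) are my choices, passed in as data:

data {
  real y;
  int<lower=1> K;          // number of quadrature points (my choice)
  vector[K] beta_grid;     // grid of beta values covering N(0, 1)
  vector[K] log_w;         // log weights: quadrature weight + log N(beta | 0, 1)
}
parameters {
  real<lower=0> sigma;
}
model {
  vector[K] lp;
  for (k in 1:K)
    lp[k] = log_w[k] + normal_lpdf(y | beta_grid[k], sigma);
  target += log_sum_exp(lp);   // numerically marginalizes beta out of the likelihood
}

But this feels clumsy and would scale badly if beta had more than one dimension, which is why I am asking whether there is a more direct way.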
Arguably, this is not the most Bayesian approach, since I should always update beta after I see y. Nevertheless, at least as an approximate inference, is there any way to impute the distribution of beta (in the posterior distribution) like this?