Use parameter output in the first model as an input in the next stan model

Suppose parameter \alpha is estimated using Model 1. If we have only one chain with a length of 2000 samples, then the posterior sample of \alpha has the dimension 2000 * 1 in stan output. I want to use this parameter as a constant value in Model 2 (number of chains and samples like Model 1). I do not want to use the mean or median of the posterior sample of \alpha as a real value in Model 2, I want to use the value of \alpha in each iteration of Model 2. In fact, I use all 2,000 values of \alpha in Model 2.

How can I do this?

1 Like

In principle you can pass the samples from one model as data to the next model, whether or not this is reasonable depends on the question you are trying to answer.

Personally, I would not know when this makes sense, but my creativity is quite limited.

Maybe someone else knows?

How can do this? can you give me an example?

In principle you can do this, but you probably should not:

Model 1:

parameters {
 real x;
}
model {
 x ~ normal(0,1);
}

Model 2:

data {
 real x;
}
parameters {
 real y;
}
model {
 y ~ normal(x,1);
}

And then pass the samples from model 1 one at a time to model 2.

However, I suspect you might actually want to do something like the following:

parameters{
  real x;
  real y;
}
model{
  x ~ normal(0,1);
  y ~ normal(x,1);
}

I have not thought too hard about this, but the third model should be “better” and I’m not even sure whether model1+model2 do the same thing as model3, it might be so for this case but not for more complex cases?

Thank you. But my problem is that x in the output of the first model is a vector as long as the number of samples and not a real number.
Your latest model is fine, but I refer you to the question I asked on another topic.

This topic might be helpful to you:

2 Likes

This thread might also be of interest.

3 Likes

For example,

model_1 <- "
data {
int A;
vector[A] x;
}
parameters {
real mu;
real<lower = 0> sigma2;
}
model {
mu     ~ normal(0,10);
sigma2 ~ inv_gamma(2.1,2);
x      ~ normal(mu,sqrt(sigma2));
}
"

I want to use mu in below:

model_2 <- "
data {
int A;
vector[A] y;
???? mu;
}
parameters {
real lambda;
real<lower = 0> sigma2;
}
model {
lambda ~ normal(0,10);
sigma2 ~ inv_gamma(2.1,2);
y      ~ normal(mu + lambda , sqrt(sigma2));
}
"

How can use mu as an input data in model_2? My models are complex and this is a simple summary of my work.

@Funko_Unko’s first option is to fit your model two iteratively, using a different posterior draw of mu as input data each time. Is your question about how to write the appropriate for-loop to do this? Note that it involves fitting model 2 many times, and so it might be slow.

@Funko_Unko’s second option is to fit models 1 and 2 jointly, but I gather this is too slow for your purposes.

The third option is to fit an appropriate parametric approximation to the posterior distribution of mu, and to sample mu from that distribution jointly with the remainder of model 2. For a potentially useful family of parametric approximations to consider, read the post linked by @maxbiostat above. Do you have further questions about this strategy?

1 Like

first option:
You are right, I do not want to fit model 2 many times.

second option:
How can I write my model with this strategy? Because in the @Funko_Unko’s comment, x and y are parameters, while in my model, they must be data and mu and lambda are parameters.

third option:
Can you give a simple example according to my model?

Second option:
This involves fitting model 1 and model two jointly at the same time. Model 1 is re-fit here, so mu is a parameter. @Funko_Unko’s x and y are just dummy names for model parameters. They could represent your mu and lambda.

Third option:
Suppose that you conclude that the posterior distribution for mu from model 1 is a normal(a, b).
Then your model 2 would be

data {
  int A;
  vector[A] y;
  real a;
  real b;
}
parameters {
  real lambda;
  real<lower = 0> sigma2;
  real mu;
}
model {
  mu ~ normal(a, b);
  lambda ~ normal(0,10);
  sigma2 ~ inv_gamma(2.1,2);
  y ~ normal(mu + lambda , sqrt(sigma2));
}

@maxbiostat linked to a thread that includes a very useful, flexible family of distributions that could let you parameterize your distribution for mu even if it isn’t easily approximated by other familiar distributions that are implemented in Stan.

2 Likes

I’ll just add that two-step approaches to composing models in this way are generally hard and there is a big risk of errors and we have no good diagnostics to check against those errors. So before spending time on figuring out how to fit the models in sequence, I would first invest a bit of effort to try to make a joint model work, because then Stan’s diagnostics protect you against errors. If the joint model takes too long to fit or has other issues, this might easily signal real problems that don’t go away when using a 2 step fitting process.

I agree that there are use cases for 2 step process, I just think it is usually a desperate choice and should not be taken lightly. I did a two step fitting for one project where I really didn’t want to have information flow from the second model to the first model. I had hundreds of datasets for the second step that i wanted to share the same fitted parameters from the first step (and the joint model was out of reach precisely because I needed hundreds of datasets, for smaller data it worked great). So if you want to prevent the information in the second model from influencing estimates from the first model, that might a good reason for a two step process, but for most tasks, restricting information flow in this way is actually undesirable. In my case I did approximate the posterior with a multivariate normal and it wasn’t terrible, but it also did lose some information (the project: ODE Model of Gene Regulation)

Good luck with your project!

2 Likes

Another use case for a 2-step process that I’ve encountered is when the second model uses a derived quantity from the first model whose Jacobian is near-hopeless. In my case, I fit a bunch of logistic GAMs in the first step, and then the second step need to model the x-position where the GAM first achieved half of its maximum value on the probability scale. So the function is many-to-one, highly nonlinear, and very difficult to write down. I used independent normal approximations of these halfmax quantities, which… wasn’t terrible.

1 Like

This is a key motivation for Markov melding: Joining and Splitting Models with Markov Melding.

More generally, two step approaches are quite common in ODE models, Joint models, Hierarchical models/meta-analysis, etc. However the estimation procedures / samplers / tricks necessary for the two-stage approach to ‘work’ can be very specific to the application.

4 Likes