I am currently using Stan to estimate a fairly complex model with multiple layers of latent variables. Put simply, I would ideally like to run two parallel chains, where the parameter estimates (of a reduced model) from one chain are passed as data to the full model run by the other chain at every MCMC iteration. But I suspect this is not possible in Stan. So I wonder instead whether, within one chain, I can specify two models with two targets to achieve this?

Any input on this will be highly appreciated! Thank you so much.

What you’re describing is having two sets of parameters, A and B, where parameters in A influence parameters in B but parameters in B do not influence parameters in A?

If so, then the answer is that this is not possible in Stan. I’ll leave the “why” to some old posts on the forum (it’s kind of a long explanation).

It’s possible to accomplish something like this with multiple imputation: run one model, sample parameters from its posterior, feed those into another model, then repeat the sampling + second-model steps a bunch of times and mix the posteriors back together. It can be a bit slow doing this.

You’ll have to think about what that means/if that makes sense for your model though.

Thank you so much for the help, Ben.
Yes, you are right in describing the overall problem that I have.

For your suggested approach “with multiple imputation by running one model, sampling parameters from the posterior, feeding that into another model, and then repeating the sampling + second model steps a bunch and mixing the posteriors back together”: I am not sure whether I understand it or how to implement it. Do you mean writing both models A and B in the “model” block as one model? I would also put the estimates from A into the transformed parameters for model B, so that they can be used directly in the “model” block.

Does this reflect the procedure you described? Thanks.

You can alternate models conveniently via rstan, though you have to do some warm-up plus one sampling iteration in each case. Whether it’s a good idea is a whole other story.

A and B are two separate models. The values from the posterior of A would enter the B model through data. In super questionable pseudocode, the multiple imputation thing looks like:

run model A
for i in 1:1000 {
  draw theta from the posterior of A
  run model B, feeding in theta as data
  save the posterior from model B in a list
}
mix the posteriors of all the B runs together
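To make that loop concrete, here is a runnable numpy stand-in (not actual Stan/rstan code: the conjugate-normal “models”, the data, and the names theta and phi are all invented for illustration, so each posterior can be drawn analytically instead of via MCMC):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: model A is y_A ~ N(theta, 1); model B is
# y_B ~ N(theta + phi, 1), with theta fed in as data.
y_A = rng.normal(1.0, 1.0, size=50)
y_B = rng.normal(3.0, 1.0, size=50)

def posterior_A(y):
    """Posterior of theta given y, with a N(0, 10^2) prior and unit noise."""
    prec = 1.0 / 100.0 + len(y)
    return np.sum(y) / prec, np.sqrt(1.0 / prec)

def posterior_B(y, theta):
    """Posterior of phi given y, treating theta as fixed data."""
    prec = 1.0 / 100.0 + len(y)
    return np.sum(y - theta) / prec, np.sqrt(1.0 / prec)

# "Run model A" once, then: draw theta, "run" B given theta, pool the draws.
m_A, s_A = posterior_A(y_A)
pooled = []
for _ in range(1000):
    theta = rng.normal(m_A, s_A)         # one draw from A's posterior
    m_B, s_B = posterior_B(y_B, theta)   # model B sees theta as data
    pooled.append(rng.normal(m_B, s_B))  # save a draw from B's posterior
pooled = np.asarray(pooled)              # the mixed posterior for phi
print(round(pooled.mean(), 2))           # should land near phi = 2
```

In a real rstan workflow, `posterior_A`/`posterior_B` would each be a `sampling()` call on a compiled model, and the pooling step would concatenate the saved draws.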

Google around for “multiple imputation in Stan” or “cut in Stan” and you’ll probably see discussions that relate to your problem.

Truthfully, the implementation probably isn’t going to be very fun, and I don’t think this sort of thing is advisable (just going from the Gelman blog and the other times this conversation has come up). You’ll have to think about what it means for your problem if you use it.

What’s wrong with your original, complex model? Is it not sampling efficiently?

I figured out a solution similar to what you suggested. I write the joint likelihood of models A and B, and also create different datasets for each model (even though in fact they share the same response variables). This way, model A is independent of B, and I can pass the fitted values from model A to model B in the model block.

Alternatively, I think I could run two Stan models, A and B, in sequence, passing the posteriors from A as data to model B. But based on my testing so far, the simpler version above seems to work well.

It’s not possible from within the Stan language itself. You can do various things like this from the outside, as others have suggested, but Stan’s not ideally set up for it in all the interfaces yet (you need to do things like initialize mass matrices, etc., each time).

What you’re talking about is the “cut” operation in BUGS, which isn’t Bayesian. We don’t support that in Stan. The cases where it’s kosher are things like the use of generated quantities in Stan (which don’t feed back into the model).

This comes up with multiple imputation, where you run model A to get many draws from the posterior. Then, for each such draw from the posterior, you run model B, and finally you put everything back together to get a joint posterior. This also isn’t Bayesian, despite gluing two Bayesian steps together.

I’m curious about multiple imputation being described as non-Bayesian, since it’s described as a “gold standard” in Plummer’s “Cuts in Bayesian graphical models”, especially when contrasted with BUGS’s cut(). Are you suggesting that it’s a bad approach?

I was actually there for Martyn’s presentation at MCMSki, but that link is paywalled.

I’m suggesting it’s suboptimal and only done for computational expediency.

The “gold standard” would be building a joint model p(y, y_miss, theta) and doing full joint Bayesian inference for y_miss and theta (where y is the observed data, y_miss the missing data, and theta the parameters).

Multiple imputation only approximates that in most cases. The reason it’s not fully Bayesian is that there’s restricted information flow, as described by Martyn Plummer (and in the BUGS book).
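As a toy illustration of that joint model (a numpy sketch with an invented conjugate-normal model, not Stan code), full joint inference draws theta and y_miss together from the single posterior p(theta, y_miss | y_obs), rather than imputing in separate stages:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: y ~ N(theta, 1) with theta ~ N(0, 10^2), and some
# entries of y missing. Full Bayes treats y_miss as extra parameters.
y_obs = rng.normal(2.0, 1.0, size=40)
n_miss = 10

# Exact posterior of theta given the observed data (conjugate normal)
prec = 1.0 / 100.0 + len(y_obs)
post_mean, post_sd = np.sum(y_obs) / prec, np.sqrt(1.0 / prec)

# Joint posterior draws: theta | y_obs, then y_miss | theta for each draw,
# so each row pairs a theta value with y_miss values from the same draw.
theta = rng.normal(post_mean, post_sd, size=2000)
y_miss = rng.normal(theta[:, None], 1.0, size=(2000, n_miss))
```

In this toy the missing responses carry no extra information about theta, so the joint posterior factorizes neatly; in richer models (the cases Plummer discusses) the missing data would feed back into theta, which is exactly the information flow that cutting or staged imputation restricts.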

For a while, Andrew was working on a fully Bayesian alternative, but I don’t know what ever became of that work.