Use posterior from one model as data for another?

I am fitting a model to forecast an outcome into the future. The model is a Poisson model with an offset that is equal to the population size. Since the population size is not known, I need to build a separate model to create population projections to use for the offset.

From the population projection model, I get the posterior for the expected population in each future year. I would like to incorporate the uncertainty of these projections into my main model. Is there a good way to do this? I’m imagining an approach where each iteration of the final model “sees” a different version of the population projections, i.e. the one corresponding to a different sample from the posterior. I would use the same number of iterations in both models to make this easier.

I’m not sure if this makes sense at all, because I imagine it might make sampling difficult if the data is changing from iteration to iteration. But I also imagine these two models could be combined into one Stan program, one that includes both sub-models. So it would seem what I’m trying to do is very similar to this latter approach, it’s just easier for me to think about separating the two models.

Is this possible in Stan, and if so how would I do it? It would be relatively straightforward to feed in the data as an array, I’m just not sure if there’s a way to index a specific iteration of the data.

Oh, and ideally I would be able to do this in brms. But I think I have a workaround for that (using the empty argument).

1 Like

I had a similar (but less sophisticated) question recently and the responses to that could be helpful: How to estimate a parameter for use in another model in brms?.

Particularly the Stan code that @andrjohns shared. The key point was considering the two sub-models as part of one hierarchical model and fitting each within the same Markov chain.

1 Like

Thanks @LachlanC for pointing me to this solution. I was aware that the two models could be combined into a single MCMC. However, for several reasons, I specifically want to separate the two models.

FYI - the main reason I want to separate the two models is because I want my estimates of the population size to incorporate sampling uncertainty, i.e., they are the result of brms::posterior_predict(). Maybe there’s a way to code up this sampling directly in Stan, but I don’t know how to do it, and it seems messy. If there was a way to feed a posterior in as data, that would be much simpler.

When I run into this, it’s one of two cases:

  1. The posterior I want to substitute is a covariate/linear predictor in some regression. Since we don’t model covariates in regression problems, you can just grab a couple draws (50 should do it) from your 1st posterior, call posterior_predict() on each of the draws, and then mix all the draws together. This is I think what you are dealing with here.

  2. I want my posterior in one model to become a prior on a coefficient in a second model, and for computational/logistical/etc reasons I can’t just fit one big model. Then I use MASS:fitdistr to fit a parametric distribution to the posterior (student-t works well for unconstrained parameters). Then when I code up my 2nd-stage model, I give the parameter the corresponding prior & pass in the arguments from fitdistr through the data block.

Hope that’s helpful (and I’m not missing your question).

@cmccartan your second option here is a really neat idea. I’ll use that in future thanks for sharing

The straightforward approach is to use (an approximation, more conviniently, of) the posterior distribution from one model as the prior for the next one.

If \beta is your offset parameter and \theta the rest of your parameters, the probability of your second model model would be p(\theta,\beta|D) and conditioning on a given value of beta and incorporating uncertainty from the posterior distribution gives you p(\theta|\beta,D)p(\beta). I’m pretty sure you can obtain exactly the same expression for the posterior as if you are estimating \beta together with everything where p(\beta) is the prior, except it would be coming from another model’s inference.

Some care should be taken when using a posterior as a prior, though. Ideally the estimate should come from different data to avoid the risk of “double dipping”, but it is also possible to use the same data if the models are sufficiently different (i.e. each model is extracting different information from the data). Otherwise it’s likely that successive rounds of replacing the prior with the previous posterior will increasingly bias your estimate.