Using posteriors as new priors

That’s about right, but there’s a subtle point that I want to emphasize.

Given observational models for both experiments, including which parameters are shared between them, \pi_{1} (y_{1} \mid \theta_{1}, \phi) and \pi_{2}(y_{2} \mid \theta_{2}, \phi), the options are straightforward. In theory we could specify our posteriors sequentially, first combining the initial observational model and prior,

\pi(\theta_{1}, \phi \mid y_{1}) \propto \pi(y_{1} \mid \theta_{1}, \phi) \, \pi(\theta_{1}, \phi)

and then using that first posterior as the prior to complement the second observational model (along with a new prior for the parameters specific to the second experiment),

\pi(\theta_{1}, \theta_{2}, \phi \mid y_{1}, y_{2}) \propto \pi(y_{2} \mid \theta_{2}, \phi) \, \pi(\theta_{1}, \phi \mid y_{1}) \, \pi(\theta_{2} \mid \theta_{1}, \phi)

Or you could fit everything jointly. Using the first data set you’d get

\pi(\theta_{1}, \phi \mid y_{1}) \propto \pi(y_{1} \mid \theta_{1}, \phi) \, \pi(\theta_{1}, \phi)

again, but when incorporating the second data set you’d fit the first data set again,

\pi(\theta_{1}, \theta_{2}, \phi \mid y_{1}, y_{2}) \propto \pi(y_{2} \mid \theta_{2}, \phi) \, \pi(y_{1} \mid \theta_{1}, \phi) \, \pi(\theta_{1}, \theta_{2}, \phi).

At the cost of some redundancy, you get to use computational tools that fit one model at a time, like Markov chain Monte Carlo.
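To make the equivalence concrete, here is a minimal numerical sketch. The two routes target the same posterior because the joint prior always factors as \pi(\theta_{1}, \theta_{2}, \phi) = \pi(\theta_{1}, \phi) \, \pi(\theta_{2} \mid \theta_{1}, \phi). Everything specific here is an assumption made for illustration, not part of the original question: a linear-Gaussian model with y_{1} \sim N(\theta_{1} + \phi, \sigma) and y_{2} \sim N(\theta_{2} + \phi, \sigma), a known \sigma, independent unit normal priors, and a hypothetical helper `gaussian_update`. In this conjugate setting both posteriors are available in closed form, so they can be compared directly.

```python
import numpy as np


def gaussian_update(mean, prec, X, y, sigma):
    """Conjugate update of a N(mean, prec^-1) prior on the coefficients
    given y ~ N(X @ coef, sigma^2 I). Hypothetical helper for this sketch."""
    new_prec = prec + X.T @ X / sigma**2
    new_mean = np.linalg.solve(new_prec, prec @ mean + X.T @ y / sigma**2)
    return new_mean, new_prec


rng = np.random.default_rng(1)
sigma = 1.0                           # assumed known measurement scale
theta1, theta2, phi = 0.5, -1.0, 2.0  # simulated "true" values
n1, n2 = 20, 30
y1 = rng.normal(theta1 + phi, sigma, n1)
y2 = rng.normal(theta2 + phi, sigma, n2)

# Design matrices; columns are (theta1, theta2, phi).
X1 = np.tile([1.0, 0.0, 1.0], (n1, 1))
X2 = np.tile([0.0, 1.0, 1.0], (n2, 1))

# Independent unit normal priors, so pi(theta2 | theta1, phi) = pi(theta2).
prior_mean = np.zeros(3)
prior_prec = np.eye(3)

# Sequential: the first posterior (theta2 is untouched by y1 and keeps its
# prior) becomes the prior for the second experiment.
m1, P1 = gaussian_update(prior_mean, prior_prec, X1, y1, sigma)
m_seq, P_seq = gaussian_update(m1, P1, X2, y2, sigma)

# Joint: fit both data sets at once from the original prior.
m_joint, P_joint = gaussian_update(
    prior_mean, prior_prec, np.vstack([X1, X2]), np.concatenate([y1, y2]), sigma
)

print("posterior means match:", np.allclose(m_seq, m_joint))
print("posterior precisions match:", np.allclose(P_seq, P_joint))
```

The match is exact only because the first posterior can be carried forward exactly. With Markov chain Monte Carlo the first posterior is available only as samples, which is why the joint fit is typically the more practical route.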

Importantly, the decision to fit things with incremental posterior-to-prior sequences or with joint posteriors over all of the available data is independent of how the two experiments are modeled. In the example introduced by @mike-lawrence one has to figure out how to pool parameters between the two experiments regardless of how the incremental data are fit. While this modeling is really important, it is independent of the original question.

Ultimately I recommend fitting all of the data if possible. Streaming is best used for industrial applications where data are received in batches and inferences have to be computed on the fly and quickly to monitor the system being analyzed. In these cases priors approximately informed by previous posteriors are usually good enough.
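As a rough sketch of what "approximately informed" can look like in a streaming setting, consider the following toy example. All of its specifics are assumptions for illustration: Poisson counts with a single shared parameter \phi = log rate, a normal prior on \phi, grid-based posterior evaluation, and a moment-matched normal approximation of each batch posterior used as the prior for the next batch. It is one simple approximation scheme among many, not a recommended production method.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-3.0, 5.0, 4001)  # uniform grid over phi = log rate


def log_normal_density(x, mu, sd):
    # Normal log density up to an additive constant.
    return -0.5 * ((x - mu) / sd) ** 2 - np.log(sd)


def poisson_log_lik(phi, y):
    # Poisson log likelihood in phi, dropping the constant sum(log(y_i!)).
    return y.sum() * phi - y.size * np.exp(phi)


def grid_moments(log_post):
    # Mean and standard deviation of an unnormalized log density on the grid.
    w = np.exp(log_post - log_post.max())
    w /= w.sum()
    mean = np.sum(w * grid)
    sd = np.sqrt(np.sum(w * (grid - mean) ** 2))
    return mean, sd


# Simulated stream: 8 batches of 25 counts each.
batches = [rng.poisson(np.exp(1.2), size=25) for _ in range(8)]

# Streaming: each batch posterior is replaced by a moment-matched normal,
# which then serves as the prior for the next batch.
mu, sd = 0.0, 2.0  # initial prior on phi
for y in batches:
    log_post = log_normal_density(grid, mu, sd) + poisson_log_lik(grid, y)
    mu, sd = grid_moments(log_post)

# Reference: fit all of the data at once with the original prior.
all_y = np.concatenate(batches)
mu_full, sd_full = grid_moments(
    log_normal_density(grid, 0.0, 2.0) + poisson_log_lik(grid, all_y)
)

print(f"streaming: mean={mu:.4f}, sd={sd:.4f}")
print(f"full fit:  mean={mu_full:.4f}, sd={sd_full:.4f}")
```

The streaming summary should land close to the full fit but not exactly on it, because each normal approximation discards some of the shape of the previous posterior. How much that matters depends on the model and the batch sizes.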
