Suppose we’ve estimated a posterior conditional on data1, and we’d like to update the parameters \theta conditional on both data1 and data2. I know the Stan group has done this; where is the relevant paper? Moreover, I’m looking to combine parameter estimates from disjoint datasets that were generated by the same process with the same covariates. I haven’t hit the computer yet. I have an intuition but need some external validation.
Any references, groups or authors I can check out?
~ regards
I’m thinking basic Bayesian model averaging, right?
Resample theta1, theta2 until there’s no autocorrelation and the traceplots look good, no? But I’m looking for theory.
Hi @drezap! The first question has been an interest of mine for a while, but I’m not aware of an absolutely robust way to do this with Stan besides simply refitting the model with all the data.
The method outlined has worked well for me in the past and can limit the number of refits in a context where you have people uploading data all the time and hoping for fresh estimation. There is literature on Sequential MCMC, but I haven’t dived deeply into it since I fixed my immediate issues with PSIS. If anyone has plans to add a sequential sampler algorithm that works with existing Stan models, I would like to digitally hug that person.
This is the only way to do this in general with a black-box MCMC system. You can warm start from draws and mass matrices from the previous fit. Or you can try (Pareto smoothed) importance (re)sampling as referenced here. It’s what we do for LOO.
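To make the importance-sampling idea concrete: if data1 and data2 are conditionally independent given \theta, then p(\theta | data1, data2) is proportional to p(data2 | \theta) p(\theta | data1), so draws from the first posterior can be reweighted by the likelihood of the new data. Below is a minimal sketch in plain NumPy, not Stan code. The Normal-Normal toy model, the variable names, and the helper `conjugate_posterior` are all assumptions chosen so the result can be checked against an exact answer. It uses plain self-normalized weights without the Pareto smoothing step, so it is only a rough stand-in for PSIS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (assumed for illustration): y ~ Normal(theta, 1), theta ~ Normal(0, 10).
sigma = 1.0
prior_mu, prior_sd = 0.0, 10.0
data1 = rng.normal(1.0, sigma, size=50)
data2 = rng.normal(1.0, sigma, size=10)


def conjugate_posterior(y, mu0, sd0, sigma):
    """Exact Normal-Normal posterior mean and sd when sigma is known."""
    prec = 1.0 / sd0**2 + len(y) / sigma**2
    mean = (mu0 / sd0**2 + y.sum() / sigma**2) / prec
    return mean, np.sqrt(1.0 / prec)


# Stand-in for MCMC draws from p(theta | data1); in practice these come from the old fit.
m1, s1 = conjugate_posterior(data1, prior_mu, prior_sd, sigma)
draws = rng.normal(m1, s1, size=20_000)

# Log importance weight of each draw: log p(data2 | theta), constants dropped.
log_w = -0.5 * ((data2[None, :] - draws[:, None]) ** 2).sum(axis=1) / sigma**2
log_w -= log_w.max()          # stabilize before exponentiating
w = np.exp(log_w)
w /= w.sum()                  # self-normalized weights

ess = 1.0 / np.sum(w**2)      # effective sample size of the reweighted draws
is_mean = np.sum(w * draws)   # estimate of E[theta | data1, data2]

# Exact answer for comparison, from fitting both datasets at once.
m12, s12 = conjugate_posterior(np.concatenate([data1, data2]), prior_mu, prior_sd, sigma)
print(f"reweighted mean {is_mean:.4f}, exact mean {m12:.4f}, ESS {ess:.0f}")
```

The effective sample size is the thing to watch: when data2 moves the posterior a lot, a few draws carry most of the weight, and that is the point where refitting (or a warm-started refit) is the safer choice.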
The usual citation here is sequential Monte Carlo (SMC). The papers are very abstract, but just try to find an intro to the simple particle filter approach. SMC brings in its own problems: tempering, and the variance of the importance weights.
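For a feel of what the simple particle approach looks like with a static parameter, here is a rough sketch under assumptions of my own: the same kind of Normal-Normal toy model, data arriving in batches of 5, a resample step whenever the effective sample size drops below half the particle count, and a few random-walk Metropolis moves after each resample to restore diversity. This is a generic reweight/resample/move scheme written in NumPy, not anything Stan provides, and the thresholds and step size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model (assumed for illustration): y ~ Normal(theta, 1), theta ~ Normal(0, 10).
prior_mu, prior_sd, sigma = 0.0, 10.0, 1.0
y = rng.normal(1.0, sigma, size=60)
batches = np.array_split(y, 12)  # data arrive in 12 batches of 5


def log_lik(theta, data):
    """Log-likelihood of `data` for each particle, up to a constant."""
    return -0.5 * ((data[None, :] - theta[:, None]) ** 2).sum(axis=1) / sigma**2


def log_prior(theta):
    """Log prior density of each particle, up to a constant."""
    return -0.5 * ((theta - prior_mu) / prior_sd) ** 2


n = 5000
theta = rng.normal(prior_mu, prior_sd, size=n)  # initial particles from the prior
log_w = np.zeros(n)
seen = np.empty(0)
n_resamples = 0

for batch in batches:
    seen = np.concatenate([seen, batch])
    log_w += log_lik(theta, batch)              # reweight by the new batch only
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    if 1.0 / np.sum(w**2) < n / 2:              # ESS too low: resample, then move
        mean = np.sum(w * theta)
        step = 2.4 * np.sqrt(np.sum(w * (theta - mean) ** 2))
        theta = theta[rng.choice(n, size=n, p=w)]
        log_w = np.zeros(n)
        n_resamples += 1
        # Random-walk Metropolis moves targeting p(theta | all data seen so far).
        log_post = log_prior(theta) + log_lik(theta, seen)
        for _ in range(5):
            prop = theta + step * rng.normal(size=n)
            log_post_prop = log_prior(prop) + log_lik(prop, seen)
            accept = np.log(rng.uniform(size=n)) < log_post_prop - log_post
            theta = np.where(accept, prop, theta)
            log_post = np.where(accept, log_post_prop, log_post)

w = np.exp(log_w - log_w.max())
w /= w.sum()
smc_mean = np.sum(w * theta)

# Exact conjugate posterior mean for comparison.
prec = 1.0 / prior_sd**2 + len(y) / sigma**2
exact_mean = (prior_mu / prior_sd**2 + y.sum() / sigma**2) / prec
print(f"SMC mean {smc_mean:.4f}, exact mean {exact_mean:.4f}, resamples {n_resamples}")
```

Note that the move step re-evaluates the likelihood on all data seen so far, so this does not avoid touching the old data. The first batch also shows the weight-variance problem directly: with a wide prior, most particles get negligible weight, which is where tempering the likelihood in smaller steps would help.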
Stan doesn’t support SMC, nor do I know of a robust software package that does. Maybe someone else can comment.