I do not have any advice for fponce, but just wanted to share that I have a very similar problem. I have an ever-expanding pool of data, a model that needs to see all the data at once, and I need to produce estimates from new data in a relatively short period of time (10 min as an upper limit before people start complaining, I’d guess). Eventually, there will be too much data, and I’ll run out of tricks to improve performance and money to buy more compute.
I’m not really sure when that day will come, but in the meantime it would be nice to have an approach that produces pretty good estimates quickly, using previously computed posteriors as priors, and then computes the full posterior from all the data at a later time.
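For what it’s worth, here is a toy sketch of the posterior-as-prior idea in a conjugate Normal model (all names and values are made up for illustration). In the conjugate case the sequential update recovers the full-data posterior exactly; with a posterior approximated from MCMC draws and fed back in as a prior, it would only be an approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: estimating the mean of Normal data with known
# noise sd; prior and posterior are both Normal (conjugate), so the
# posterior after batch 1 can serve directly as the prior for batch 2.
sigma = 2.0            # known observation noise sd (assumed)
mu0, tau0 = 0.0, 10.0  # prior mean and sd (assumed)

def update(mu, tau, y, sigma):
    """One conjugate Normal-Normal update: posterior mean and sd given data y."""
    prec = 1.0 / tau**2 + len(y) / sigma**2
    mean = (mu / tau**2 + y.sum() / sigma**2) / prec
    return mean, prec**-0.5

batch1 = rng.normal(3.0, sigma, size=500)
batch2 = rng.normal(3.0, sigma, size=500)

# Sequential: yesterday's posterior becomes today's prior.
m1, t1 = update(mu0, tau0, batch1, sigma)
m_seq, t_seq = update(m1, t1, batch2, sigma)

# Full refit on all the data at once, for comparison.
m_all, t_all = update(mu0, tau0, np.concatenate([batch1, batch2]), sigma)

print(np.allclose([m_seq, t_seq], [m_all, t_all]))  # True
```

The appeal is that the cheap sequential update can serve the 10-minute latency budget, while the expensive full refit runs overnight; the catch in a non-conjugate model is finding a prior family rich enough to represent the previous posterior faithfully.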