I have a hierarchical time-series model that takes about 11 hours to sample. Every day, I get an extra data point and I re-fit my model. The extra data that I get every day, is very small compared to the data I already have, but it does affect my estimates slightly.
My question is: Without changing my Stan code every day, can I somehow take advantage of the fitted model I have from the day before to speed up the sampling for today?
(I’m using PyStan, by the way)
Many thanks!
Are you starting from the known mass matrix and last estimates as initial values? That would give you something. You could probably take those and skip adaptation (or do two separate runs, one skipping adaptation and the other one to confirm the first estimates).
2 Likes
I’ve recently gotten interested in scenarios like this too.
@sakrejda what would one use as the intial values? Medians from the previous posterior? Also, is there any opportunity at any stage of sampling where we can gain more info about the best mass matrix?
I’m specifically thinking of the following scenario:
- Adapt using data x, yielding mass matrix M_x
- using M_x, sample using data x, obtaining posterior samples S_x
- observe new data y, concatenate with x to yield data xy
- using M_x and initial values informed by S_{x}, sample using data xy, yielding samples S_{xy}
- observe new data z, concatenate with xy to yield data xyz
- using M_x and initial values informed by S_{xy}, sample using data xyz, yielding samples S_{xyz}
etc…
So with the above, after the initial adaptation, the same mass matrix is used for all future sampling. I’m wondering, however, if there is any opportunity for updating whereby, for example, there’s information in S_{xy} that can be used to derive a new mass matrix M_{xy} to be used in step 6 instead of M_x. Or if there isn’t the proper info in S_{xy} to update the mass matrix, might the required info be anywhere in the latent dynamics observed during sampling?
1 Like
Aside from my attempt above to clarify sakrejda’s very neat suggestion, @omarfsosa you should consider posting your model as well in case there are areas for optimization we can identify.
Thank you both for your reply! This definitely sounds like what I’d like to do. (Also thank you @sakrejda, I’ll just continue on this thread to make things easier).
At the moment I am using the posterior mean to specify the init
values. I’m not using the mass matrix though! How can I do that? My StanFit object does not seem to have a get_inv_metric()
method or something like that. Assuming I can infer the mass matrix somehow, how do I then pass it to the sampling
method of my model (I don’t see an option for this in the source)?
(The control
dictionary can have an item called "metric"
but according to the docstring it can only be a string)
Thank you!
Most robust would probably be to use samples from the posterior, the medians might be meaningless (think banana shaped posterior). The problem is if changes in the posterior with the new data are not incremental this could perform much worse than a full run at any given point, and in any case you would need to re-train the mass matrix after a while anyway.
I’m not sure who is watching for PyStan questions (you might want to post a new question with that tag). The metric argument works but I’m not sure of the format…
Sorry, I thought that initial values had to be a single value per parameter, so if you had:
parameters{
real mu ;
real<lower=0> sigma ;
}
model{
y ~ normal(mu,sigma) ;
}
You would need to pass in two init values, one for mu and one for sigma. Is this not the case?
yes, I’m saying it’s better to pick two particular values from particular samples rather than try to get fancy and summarize the posterior into statistics. You might have to try some different ones to get a good start but using something like means or medians that ignores posterior structure will easily produce non-starter values.
1 Like
Ah, I think I follow now; there’s danger in missing important multivariate structure by collapsing to marginal summaries, so better to simply grab a random sample.
I have '2.19.0.0'
, this is the version that. was installed by default (I’m using anaconda).
Thank you for the link it’s very useful, though I understand it as being specific for “continued” sampling, whereas I’m just looking for a sensible starting point. Am I correct in thinking that (provided I get the right pystan version) all I’d need to change is adapt_engaged=True
?