I have just started to discover the metric_file option from cmdstan and I must say it is really awesome. A few questions:
Should the engaged option be set to 0 if I give Stan the inverse mass matrix and the stepsize? I thought this would be the case, but I got very odd behavior such that I am confused. So when doing this Stan runs super-super fast, but gives me crap. Maybe this is a bug.
Is there some documentation on the mapping from the parameters to the unconstrained parameter space. In particular I would be interested in ordering.
Do we have recommendations or some guidance (likely with big warnings) what can be the minimal warmup when giving the metric file?
Can cmdstan export the inverse masses? Currently I use rstan to parse things out.
Can rstan already use the inverse mass matrix (@bgoodri? )?
This is getting really interesting in the context of MPI. Now you can use all available CPUs you have to get a single chain through warmup ASAP. Then you just start of as many chains as you like using the already calculated inverse weights. Certainly this is dangerous, but likely very handy for model-development.
Great new feature! Thanks to @mitzimorris, @bbbales2 and however else made this happen!
Should the engaged option be set to 0 if I give Stan the inverse mass matrix and the stepsize?
Can go either way. If you set it to zero, then there will be no adaptation, so you’ll just start sampling. It’s possible something is going crazy here, though I’d like to think not :P. I intended to do some work with this but never went beyond simple tests. I’ll look into this.
If you don’t adapt, you’re not letting it adapt the stepsize which would be a problem since that’s not (necessarily?) saved in the mass matrix? Realizing I don’t know if the mass matrix diagonal includes the stepsize when CmdStan spits it out.
here are a few differrent use cases in which this feature can be used:
need to run the sampler to generate more samples to increase precision of your quantities of interest - no further adaptation
want to continue adaptation, starting from specified mass matrix
the CmdStan manual, section “Euclidian Metric”, (starts on page 49) spells this out:
The metric_file option can be used with and without adaptation enabled.
If adaptation is enabled, the provided metric will be used as the initial guess in the adaptation process. If the initial guess is good, then adaptation should not change it much. If the metric is no good, then the adaptation will override the initial guess.
If adaptation is disabled, both a metric_file and stepsize should be provided to the sampler. Disabling adaptation disables both metric and stepsize adaptation, so a stepsize should be provided along with a metric to enable efficient sampling.
I think there is a bug… I did what you suggest in the manual. So I put num_warmup=0, I give the program an inverse mass matrix and I do pass in the stepsize. However, Stan ignores the stepsize which I have given to it. The fit runs extremely fast, but gives me nothing meaningful. Looking at the generated csv it is reported that a step size of 1 was used instead of what I provided at the command line. The mass matrix on the other hand is equal to what I had specified as it looks like.
Other than that, I think that the text in the manual should also tell users to give the correct initial which is needed to make the chain explore in exactly the same way the posterior.
Do we have a test which runs warmup + some iterations and then starts another chain setup such that it must generate the same iterations (specifying seed, as initial the last draw from warmup and the mass matrix)? This is what I would expect to happen.
@sakrejda : I am currently using rstan to parse the csv files generated from cmdstan. You do get a step size and the inverse mass matrix elements of the diagonal. However, the stepsize seems to be ignored right now.
Setting num_warmup=1 and engaged=0 essentially gives me the behavior I want. So with these settings then Stan does not change the stepsize and the mass matrix is maintained. Looks like it works like this for the application I have in mind (branch off different chains which share the same adaptation).
I’m having trouble reproducing it though. If I just use the bernoulli example with cmdstan, make a metric file (metric.dat, containing “inv_metric <- c(50.0)”), and then execute the binary with the command line:
Then the output file says the stepsize was actually set to 0.5:
...
# refresh = 100 (Default)
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
# Adaptation terminated
# Step size = 0.5
# Diagonal elements of inverse mass matrix:
# 50
...
I also ran a regular Bernoulli calculation (with like 1000 post-warmup samples and 1000 warmup samples) and then initialized the inv_metric and stepsize from those numbers and things seemed to work.
Hmm… I think I had num_warmup=0 and engaged=1 (and stepsize + metric file) and then the stepsize was reset to 1. I can check tom again in case this does not lead to a reproducible disaster…
I have a similar bug using cmdStan v2.20.
So the bernoulli example works fine.
(metric_file is json with inv_metric = 0.550527)
This is the plot with adapt engage=0 metric_file = metric_file stepsize = 0.820551
So passing the inverse matrix helps but I see that this example already samples closely even wihout engaging.
I have issues trying to implement this to a model I am working on. The model works fine I tested it various times with an small example using the sample algorithm I get the following approximation.
Now when trying to use adapt engage=0 metric_file = metric_file stepsize = 0.639891
(inv_metric = array( c(0.00541849, 0.0232568, 0.0113541, 0.0143112, 0.0151313),
dim = 5 ) )
I get this disaster
No I am not doing warm-up. I try to see with warm-up but I had in mind to build a serialised code to submit to a cluster of workstations. Reading this I thought the inverse mass matrix saves the state of the Markov Chain so one can continue sampling is that correct? Still if a short wam-up works maybe I can still use it
Checked again using this:
# Adaptation terminated
# Step size = 0.639891
# Diagonal elements of inverse mass matrix:
# 0.00541849, 0.0232568, 0.0113541, 0.0143112, 0.0151313
and running even with 1000 wam-up iterations if engaged=0 it seems not to work, disaster again.
The error is like the beggining of the thread it runs super-fast but it does not seem to be working is the bug still persisting?
I use a wrapper that call cmdStan and using the following arguments:
Hi That worked great! Yes so the solution is then to keep adapt engaged=1 and use the parameters adapt init_buffer=0 window=0 term_buffer=50 num_warmup=50
Thank you!