Saved mass matrix

Hi!

I have just started to discover the metric_file option from cmdstan and I must say it is really awesome. A few questions:

  • Should the engaged option be set to 0 if I give Stan the inverse mass matrix and the stepsize? I thought this would be the case, but I got very odd behavior such that I am confused. So when doing this Stan runs super-super fast, but gives me crap. Maybe this is a bug.

  • Is there some documentation on the mapping from the parameters to the unconstrained parameter space. In particular I would be interested in ordering.

  • Do we have recommendations or some guidance (likely with big warnings) what can be the minimal warmup when giving the metric file?

  • Can cmdstan export the inverse masses? Currently I use rstan to parse things out.

  • Can rstan already use the inverse mass matrix (@bgoodri? )?

This is getting really interesting in the context of MPI. Now you can use all available CPUs you have to get a single chain through warmup ASAP. Then you just start of as many chains as you like using the already calculated inverse weights. Certainly this is dangerous, but likely very handy for model-development.

Great new feature! Thanks to @mitzimorris, @bbbales2 and however else made this happen!

Best,
Sebastian

2 Likes

Should the engaged option be set to 0 if I give Stan the inverse mass matrix and the stepsize?

Can go either way. If you set it to zero, then there will be no adaptation, so you’ll just start sampling. It’s possible something is going crazy here, though I’d like to think not :P. I intended to do some work with this but never went beyond simple tests. I’ll look into this.

If it’s not zero, then adaptation will still happen, but you’ve just given the warmup a hopefully better guess about where to start. I’ve not experimented with the adaptation itself, but the algorithms are in: https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/var_adaptation.hpp and https://github.com/stan-dev/math/blob/develop/stan/math/prim/mat/fun/welford_var_estimator.hpp (and there are separate files for estimating a full covariance matrix)

Do we have recommendations or some guidance (likely with big warnings) what can be the minimal warmup when giving the metric file?

I doubt the answer changes from anything else. Long enough :D

Re your other questions, I’d like to know the answers as well.

Not yet.

That sounds like it might be a bug. Can you capture the mass matrix at the end and compare?

If you don’t adapt, you’re not letting it adapt the stepsize which would be a problem since that’s not (necessarily?) saved in the mass matrix? Realizing I don’t know if the mass matrix diagonal includes the stepsize when CmdStan spits it out.

here are a few differrent use cases in which this feature can be used:

  • need to run the sampler to generate more samples to increase precision of your quantities of interest - no further adaptation

  • want to continue adaptation, starting from specified mass matrix

the CmdStan manual, section “Euclidian Metric”, (starts on page 49) spells this out:

The metric_file option can be used with and without adaptation enabled.
If adaptation is enabled, the provided metric will be used as the initial guess in the adaptation process. If the initial guess is good, then adaptation should not change it much. If the metric is no good, then the adaptation will override the initial guess.

If adaptation is disabled, both a metric_file and stepsize should be provided to the sampler. Disabling adaptation disables both metric and stepsize adaptation, so a stepsize should be provided along with a metric to enable efficient sampling.

1 Like

I think there is a bug… I did what you suggest in the manual. So I put num_warmup=0, I give the program an inverse mass matrix and I do pass in the stepsize. However, Stan ignores the stepsize which I have given to it. The fit runs extremely fast, but gives me nothing meaningful. Looking at the generated csv it is reported that a step size of 1 was used instead of what I provided at the command line. The mass matrix on the other hand is equal to what I had specified as it looks like.

Other than that, I think that the text in the manual should also tell users to give the correct initial which is needed to make the chain explore in exactly the same way the posterior.

Do we have a test which runs warmup + some iterations and then starts another chain setup such that it must generate the same iterations (specifying seed, as initial the last draw from warmup and the mass matrix)? This is what I would expect to happen.

@sakrejda : I am currently using rstan to parse the csv files generated from cmdstan. You do get a step size and the inverse mass matrix elements of the diagonal. However, the stepsize seems to be ignored right now.

Setting num_warmup=1 and engaged=0 essentially gives me the behavior I want. So with these settings then Stan does not change the stepsize and the mass matrix is maintained. Looks like it works like this for the application I have in mind (branch off different chains which share the same adaptation).

1 Like

that does sound like a bug. we’ll look into it.
glad you have a workaround.

1 Like

Had a quick look at this. Definitely sounds like a bug, and the original tests in cmdstan I coded up did not check this case.

I made an issue for this just so there’s an issue number to reference (https://github.com/stan-dev/cmdstan/issues/602).

I’m having trouble reproducing it though. If I just use the bernoulli example with cmdstan, make a metric file (metric.dat, containing “inv_metric <- c(50.0)”), and then execute the binary with the command line:

./bernoulli method=sample num_samples=1 num_warmup=0 adapt engaged=0 algorithm=hmc engine=nuts metric=diag_e metric_file=metric.dat stepsize=0.5 data file=bernoulli.data.R output file=output.csv

Then the output file says the stepsize was actually set to 0.5:

...
#   refresh = 100 (Default)
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
# Adaptation terminated
# Step size = 0.5
# Diagonal elements of inverse mass matrix:
# 50
...

I also ran a regular Bernoulli calculation (with like 1000 post-warmup samples and 1000 warmup samples) and then initialized the inv_metric and stepsize from those numbers and things seemed to work.

I added more tests to the cmdstan metric_test.cpp file (which should be testing these interfaces) and am getting the expected behavior as well (can be found here: https://github.com/bbbales2/cmdstan-rus/commit/8c13b906621cc3011de124cd6b234425f917ba93).

Any ideas on what I’m doing different?

Hmm… I think I had num_warmup=0 and engaged=1 (and stepsize + metric file) and then the stepsize was reset to 1. I can check tom again in case this does not lead to a reproducible disaster…

I have a similar bug using cmdStan v2.20.
So the bernoulli example works fine.
(metric_file is json with inv_metric = 0.550527)
This is the plot with adapt engage=0 metric_file = metric_file stepsize = 0.820551

Now if I set adapt engage=0 metric_file = "" stepsize = 1

So passing the inverse matrix helps but I see that this example already samples closely even wihout engaging.

I have issues trying to implement this to a model I am working on. The model works fine I tested it various times with an small example using the sample algorithm I get the following approximation.

Now when trying to use adapt engage=0 metric_file = metric_file stepsize = 0.639891
(inv_metric = array( c(0.00541849, 0.0232568, 0.0113541, 0.0143112, 0.0151313),
dim = 5 ) )
I get this disaster

The model that I use if you want to test it is a bit complex and just something I am working on that I attach here for testing purposes if can help.simprogress4.data.R (4.9 KB) simprogress4.stan (2.6 KB) test-inv-matrix-1-metric-file.R (97 Bytes)init.R (102 Bytes)

Do you do any warmup?

It could be that your initial position is very far from the typical set. So does warmup help?

No I am not doing warm-up. I try to see with warm-up but I had in mind to build a serialised code to submit to a cluster of workstations. Reading this I thought the inverse mass matrix saves the state of the Markov Chain so one can continue sampling is that correct? Still if a short wam-up works maybe I can still use it

Checked again using this:
# Adaptation terminated
# Step size = 0.639891
# Diagonal elements of inverse mass matrix:
# 0.00541849, 0.0232568, 0.0113541, 0.0143112, 0.0151313

and running even with 1000 wam-up iterations if engaged=0 it seems not to work, disaster again.

The error is like the beggining of the thread it runs super-fast but it does not seem to be working is the bug still persisting?
I use a wrapper that call cmdStan and using the following arguments:

cmdStanTools::runModel2(model = "Model/model/simprogress4", 
                        data = "Model/data/simprogress4.data.R",
                        iter = 1000, warmup = 1000, thin = 1,
                        init = "Model/model/simprogress4/temp/1/init.R",
                        seed = 123456,
                        engaged = 0,
                        tag = "sample",
                        chain = 1, refresh = 100,
                        metric_file =  "Model/MetricFile/test-inv-matrix-1-metric-file.R",
                        adapt_delta = 0.9, stepsize = 0.639891)

where engaged is adapt engaged. Just pasted for the purpose of reviewing the arguments.

You’ll need to supply a stepsize or let that be adapted still. adapt engaged=0 turns timestep and metric adaptation off. Try:

./simprogress4 sample adapt init_buffer=0 window=0 term_buffer=50 num_warmup=50 algorithm=hmc metric_file=test-inv-matrix-1-metric-file.R data file=simprogress4.data.R init=init.R

(And yeah we should really document this, apologies for the frustration)

2 Likes

Hi That worked great! Yes so the solution is then to keep adapt engaged=1 and use the parameters adapt init_buffer=0 window=0 term_buffer=50 num_warmup=50
Thank you!

1 Like

noted: better documentation/discussion of sampler config for user-specified mass matrix · Issue #95 · stan-dev/docs · GitHub

1 Like