Saved mass matrix

wds15 · January 8, 2018, 11:01am

Hi!

I have just started to discover the metric_file option from cmdstan and I must say it is really awesome. A few questions:

Should the engaged option be set to 0 if I give Stan the inverse mass matrix and the stepsize? I thought this would be the case, but I got very odd behavior such that I am confused. So when doing this Stan runs super-super fast, but gives me crap. Maybe this is a bug.
Is there some documentation on the mapping from the parameters to the unconstrained parameter space. In particular I would be interested in ordering.
Do we have recommendations or some guidance (likely with big warnings) what can be the minimal warmup when giving the metric file?
Can cmdstan export the inverse masses? Currently I use rstan to parse things out.
Can rstan already use the inverse mass matrix (@bgoodri? )?

This is getting really interesting in the context of MPI. Now you can use all available CPUs you have to get a single chain through warmup ASAP. Then you just start of as many chains as you like using the already calculated inverse weights. Certainly this is dangerous, but likely very handy for model-development.

Great new feature! Thanks to @mitzimorris, @bbbales2 and however else made this happen!

Best,
Sebastian

bbbales2 · January 8, 2018, 5:03pm

Should the engaged option be set to 0 if I give Stan the inverse mass matrix and the stepsize?

Can go either way. If you set it to zero, then there will be no adaptation, so you’ll just start sampling. It’s possible something is going crazy here, though I’d like to think not :P. I intended to do some work with this but never went beyond simple tests. I’ll look into this.

If it’s not zero, then adaptation will still happen, but you’ve just given the warmup a hopefully better guess about where to start. I’ve not experimented with the adaptation itself, but the algorithms are in: https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/var_adaptation.hpp and https://github.com/stan-dev/math/blob/develop/stan/math/prim/mat/fun/welford_var_estimator.hpp (and there are separate files for estimating a full covariance matrix)

Do we have recommendations or some guidance (likely with big warnings) what can be the minimal warmup when giving the metric file?

I doubt the answer changes from anything else. Long enough :D

Re your other questions, I’d like to know the answers as well.

bgoodri · January 8, 2018, 5:14pm

Not yet.

sakrejda · January 8, 2018, 5:54pm

That sounds like it might be a bug. Can you capture the mass matrix at the end and compare?

sakrejda · January 8, 2018, 5:55pm

If you don’t adapt, you’re not letting it adapt the stepsize which would be a problem since that’s not (necessarily?) saved in the mass matrix? Realizing I don’t know if the mass matrix diagonal includes the stepsize when CmdStan spits it out.

mitzimorris · January 8, 2018, 9:47pm

here are a few differrent use cases in which this feature can be used:

need to run the sampler to generate more samples to increase precision of your quantities of interest - no further adaptation
want to continue adaptation, starting from specified mass matrix

the CmdStan manual, section “Euclidian Metric”, (starts on page 49) spells this out:

The metric_file option can be used with and without adaptation enabled.
If adaptation is enabled, the provided metric will be used as the initial guess in the adaptation process. If the initial guess is good, then adaptation should not change it much. If the metric is no good, then the adaptation will override the initial guess.

If adaptation is disabled, both a metric_file and stepsize should be provided to the sampler. Disabling adaptation disables both metric and stepsize adaptation, so a stepsize should be provided along with a metric to enable efficient sampling.

wds15 · January 9, 2018, 3:27pm

I think there is a bug… I did what you suggest in the manual. So I put num_warmup=0, I give the program an inverse mass matrix and I do pass in the stepsize. However, Stan ignores the stepsize which I have given to it. The fit runs extremely fast, but gives me nothing meaningful. Looking at the generated csv it is reported that a step size of 1 was used instead of what I provided at the command line. The mass matrix on the other hand is equal to what I had specified as it looks like.

Other than that, I think that the text in the manual should also tell users to give the correct initial which is needed to make the chain explore in exactly the same way the posterior.

Do we have a test which runs warmup + some iterations and then starts another chain setup such that it must generate the same iterations (specifying seed, as initial the last draw from warmup and the mass matrix)? This is what I would expect to happen.

@sakrejda : I am currently using rstan to parse the csv files generated from cmdstan. You do get a step size and the inverse mass matrix elements of the diagonal. However, the stepsize seems to be ignored right now.

wds15 · January 9, 2018, 3:57pm

Setting num_warmup=1 and engaged=0 essentially gives me the behavior I want. So with these settings then Stan does not change the stepsize and the mass matrix is maintained. Looks like it works like this for the application I have in mind (branch off different chains which share the same adaptation).

mitzimorris · January 9, 2018, 4:08pm

that does sound like a bug. we’ll look into it.
glad you have a workaround.

bbbales2 · January 9, 2018, 6:19pm

Had a quick look at this. Definitely sounds like a bug, and the original tests in cmdstan I coded up did not check this case.

I made an issue for this just so there’s an issue number to reference (https://github.com/stan-dev/cmdstan/issues/602).

I’m having trouble reproducing it though. If I just use the bernoulli example with cmdstan, make a metric file (metric.dat, containing “inv_metric <- c(50.0)”), and then execute the binary with the command line:

./bernoulli method=sample num_samples=1 num_warmup=0 adapt engaged=0 algorithm=hmc engine=nuts metric=diag_e metric_file=metric.dat stepsize=0.5 data file=bernoulli.data.R output file=output.csv

Then the output file says the stepsize was actually set to 0.5:

...
#   refresh = 100 (Default)
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
# Adaptation terminated
# Step size = 0.5
# Diagonal elements of inverse mass matrix:
# 50
...

I also ran a regular Bernoulli calculation (with like 1000 post-warmup samples and 1000 warmup samples) and then initialized the inv_metric and stepsize from those numbers and things seemed to work.

I added more tests to the cmdstan metric_test.cpp file (which should be testing these interfaces) and am getting the expected behavior as well (can be found here: https://github.com/bbbales2/cmdstan-rus/commit/8c13b906621cc3011de124cd6b234425f917ba93).

Any ideas on what I’m doing different?

wds15 · January 9, 2018, 6:30pm

Hmm… I think I had num_warmup=0 and engaged=1 (and stepsize + metric file) and then the stepsize was reset to 1. I can check tom again in case this does not lead to a reproducible disaster…

csetraynor · October 17, 2019, 5:01pm

I have a similar bug using cmdStan v2.20.
So the bernoulli example works fine.
(metric_file is json with inv_metric = 0.550527)
This is the plot with adapt engage=0 metric_file = metric_file stepsize = 0.820551

Now if I set adapt engage=0 metric_file = "" stepsize = 1

So passing the inverse matrix helps but I see that this example already samples closely even wihout engaging.

I have issues trying to implement this to a model I am working on. The model works fine I tested it various times with an small example using the sample algorithm I get the following approximation.

Now when trying to use adapt engage=0 metric_file = metric_file stepsize = 0.639891
(inv_metric = array( c(0.00541849, 0.0232568, 0.0113541, 0.0143112, 0.0151313),
dim = 5 ) )
I get this disaster

The model that I use if you want to test it is a bit complex and just something I am working on that I attach here for testing purposes if can help.simprogress4.data.R (4.9 KB) simprogress4.stan (2.6 KB) test-inv-matrix-1-metric-file.R (97 Bytes)init.R (102 Bytes)

ahartikainen · October 17, 2019, 5:23pm

Do you do any warmup?

It could be that your initial position is very far from the typical set. So does warmup help?

csetraynor · October 17, 2019, 5:28pm

No I am not doing warm-up. I try to see with warm-up but I had in mind to build a serialised code to submit to a cluster of workstations. Reading this I thought the inverse mass matrix saves the state of the Markov Chain so one can continue sampling is that correct? Still if a short wam-up works maybe I can still use it

csetraynor · October 17, 2019, 5:35pm

Checked again using this:
# Adaptation terminated
# Step size = 0.639891
# Diagonal elements of inverse mass matrix:
# 0.00541849, 0.0232568, 0.0113541, 0.0143112, 0.0151313

and running even with 1000 wam-up iterations if engaged=0 it seems not to work, disaster again.

csetraynor · October 17, 2019, 5:37pm

The error is like the beggining of the thread it runs super-fast but it does not seem to be working is the bug still persisting?
I use a wrapper that call cmdStan and using the following arguments:

cmdStanTools::runModel2(model = "Model/model/simprogress4", 
                        data = "Model/data/simprogress4.data.R",
                        iter = 1000, warmup = 1000, thin = 1,
                        init = "Model/model/simprogress4/temp/1/init.R",
                        seed = 123456,
                        engaged = 0,
                        tag = "sample",
                        chain = 1, refresh = 100,
                        metric_file =  "Model/MetricFile/test-inv-matrix-1-metric-file.R",
                        adapt_delta = 0.9, stepsize = 0.639891)

where engaged is adapt engaged. Just pasted for the purpose of reviewing the arguments.

bbbales2 · October 17, 2019, 7:14pm

You’ll need to supply a stepsize or let that be adapted still. adapt engaged=0 turns timestep and metric adaptation off. Try:

./simprogress4 sample adapt init_buffer=0 window=0 term_buffer=50 num_warmup=50 algorithm=hmc metric_file=test-inv-matrix-1-metric-file.R data file=simprogress4.data.R init=init.R

bbbales2 · October 17, 2019, 7:15pm

(And yeah we should really document this, apologies for the frustration)

csetraynor · October 17, 2019, 9:20pm

Hi That worked great! Yes so the solution is then to keep adapt engaged=1 and use the parameters adapt init_buffer=0 window=0 term_buffer=50 num_warmup=50
Thank you!

mitzimorris · October 18, 2019, 1:45pm

noted: better documentation/discussion of sampler config for user-specified mass matrix · Issue #95 · stan-dev/docs · GitHub

Topic		Replies	Views
Extracting the mass matrix (Euclidean metric) in rstan RStan	5	1086	June 16, 2020
Passing a different mass matrix for each chain in cmdstanr Interfaces	7	704	January 13, 2021
Using a fixed step size for HMC (or NUTS) and other sampler options Algorithms mcmc	21	3309	June 27, 2019
Using adaptation info for ODE stuff? Developers	6	787	February 28, 2017
Minimizing warmup iterations - Error reading step size from CmdStan output Interfaces bug , cmdstanpy	4	471	August 15, 2023

Saved mass matrix

Related topics