Understanding the fit report header and finding warmups

This question is not about a specific model, therefore I’m not posting the model here. I’m confused about some of the details of the reports and results from Stan fits in general.

This is a report header from a recent Stan fit, which was set up to run 4 chains, iterations=50000, warmup=5000:
``` Stan report
4 chains: each with iter=(50000,50000,50000,50000); warmup=(0,0,0,0); thin=(1,1,1,1); 200000 iterations saved.
Warmup took (1192, 1168, 1138, 1155) seconds, 1.3 hours total
Sampling took (10861, 8264, 10847, 6912) seconds, 10 hours total
```
I have a few questions:

  1. The report says 200k iterations were saved, but I see only 50k iterations in the results. Are the results from the different chains averaged, or is something else going on?
  2. Why does the report say warmup=0? I passed in 5k warmup iterations when I started the fit and Stan didn’t complain (i.e. it proceeded to run and completed the fit). And Stan clearly spent about 10% of its time on warmup, which suggests it carried out 5k warmup iterations per 50k iterations, as I asked.
  3. Are warmup iterations saved anywhere? Is it possible to see how the spin-up (this may not be the right phrasing, sorry) distributions look? This is important because I do not see a clear path to determining the proper number of warmups for a given model unless one can compare the warmup results to the “real” iterations.

Thank you

Hi, what interface do you use? And how do you call the sampling?

Hi,

I’m using CmdStan via Matlab. Just upgraded to CmdStan 2.20 from 2.17, but this had no impact on the report headers.

And this is the code that initiates Stan from Matlab:
```matlab
dates = 1907:2015;
niter = 50000;   % no. of MC simulations
nwarmup = 5000;  % no. of warmups for STAN
Matm = size(dates,2); % # of elements in AHist
data = struct('Matm',Matm,'N13',nfdat13,'M13',nYears13,'fDat13',...
    SUM13.cGood,'fDaterr13',SUM13.eGood,'fAgeD13',fAgeD13,'N06noaa',...
    nfdat06noaa,'M06noaa',nYears06noaa,'fDat06noaa',SUM06.cGoodnoaa,...
    'fDaterr06noaa',SUM06.eGoodnoaa,'fAgeD06noaa',fAgeD06noaa,'N06uci',...
    nfdat06uci,'M06uci',nYears06uci,'fDat06uci',SUM06.cGooduci,...
    'fDaterr06uci',SUM06.eGooduci,'fAgeD06uci',fAgeD06uci,'N15',nfdatR15,...
    'M15',nYearsR15,'fDat15',REN15.cGood,'fDaterr15',REN15.eGood,'fAgeD15',fAgeDR15);
mod = StanModel('file','FirnOptimize_COS_SumRen_noAR1_AgeDerr.stan'); % pre-compile
mod.compile();
params = struct('data',data,'chains',4,'warmup',nwarmup,'iter',niter);
fit = mod.sampling(params); % fit model
fit.block();
print(fit);
```

@avehtari do you know about the Matlab interface?

I’m inferring from your response that I should see 200k iterations in my results when I start 4 chains with iter=50k. Am I correct? And if I used R (or another interface), would that be the case?

Where are the warmup draws saved, assuming the interface is not the problem?

I stopped using Matlab some years ago, but IIRC the Matlab interface doesn’t return the warmup draws (I never needed them).

Yes. 50k per chain is probably 50 times more than you need. Samplers other than Stan’s dynamic HMC may need 50k iterations, but Stan’s dynamic HMC is much more efficient, and I recommend starting with 1k iterations per chain.
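As a concrete starting point, a shorter run with the same MatlabStan call shown earlier might look like the minimal sketch below; the struct field names are just the ones from the original call, and how `iter` relates to `warmup` may depend on the interface version, so check the printed report header after the run:

```matlab
% Minimal sketch: same call as in the original Matlab snippet, just with
% far fewer iterations. Verify in the report header how this interface
% version interprets 'iter' relative to 'warmup'.
niter   = 1000;   % iterations per chain (assumed to be post-warmup draws)
nwarmup = 1000;   % warmup iterations per chain
params  = struct('data',data,'chains',4,'warmup',nwarmup,'iter',niter);
fit = mod.sampling(params);
fit.block();
print(fit);
```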

Not saved when using Matlab interface.

Unfortunately MatlabStan is no longer actively developed, so it is quite possible there is a bug there, or that something broke during upgrades to later versions of core Stan :-(

Thanks. I’m gonna have to look into switching to something else in the near future. Is R the most commonly used and recommended interface?

I need clarification on one more point, please. I seem to be getting results from only one chain regardless of how many chains I initiate, would you agree? Is there a disadvantage or downside to starting only one chain then, aside from having to spend more time to get the same number of iterations? In my case that would make no difference of course, since I cannot see those extra iterations anyway.

Both R and Python are well supported, in code and on this forum.

Using the R interface (I guess the same applies to Python), you get draws from all the chains, which you can extract either keeping the chain information intact or pooled together.

Launching more chains allows you to explore the posterior distribution better, and thus to discover potentially problematic areas that a single chain may not reach. Besides that, it is needed to produce diagnostics such as R-hat. In any case, if you have a few cores and enough memory, there is not much disadvantage to running several chains (say 4) in parallel, so especially during the development phase that is the safest approach.
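To make the R-hat point concrete, here is a minimal sketch of the basic (non-split) R-hat calculation for a single parameter, assuming the post-warmup draws are arranged with one column per chain; the Stan interfaces report a more refined split and rank-normalized version, but the idea is the same:

```matlab
% Basic (non-split) R-hat for one parameter: draws is an n-by-m matrix
% with n post-warmup iterations from each of m chains.
draws = randn(1000, 4);             % example input: 4 well-mixed chains
n = size(draws, 1);
chain_means = mean(draws, 1);       % 1-by-m vector of per-chain means
W = mean(var(draws, 0, 1));         % within-chain variance, averaged over chains
B = n * var(chain_means, 0);        % between-chain variance
var_plus = (n - 1)/n * W + B/n;     % pooled estimate of the posterior variance
Rhat = sqrt(var_plus / W)           % close to 1 when the chains agree
```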


Is starting four consecutive single-chain searches in series a viable workaround while I try to migrate my work to R or Python? I can combine the results externally after the runs are complete. Aside from losing R_hat, which I get from my four-chain runs (I just cannot see the sims), and the obvious time expense, would that be the same as running four chains in parallel at the same time?

As long as each chain starts from a different point, it will be fine. But note that, from the output you showed above, it seems you are already running 4 chains; it is just that the interface does not separate them out.
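If you do go the serial route, a sketch along these lines should work; note that the `seed` field is an assumption about what your MatlabStan version forwards to CmdStan, so verify it before relying on it:

```matlab
% Hedged sketch: four single-chain runs in series, each with a different
% random seed so that each chain also gets a different random
% initialization. The 'seed' field name is an assumption about the
% MatlabStan version in use.
seeds = [101 202 303 404];
fits  = cell(1, numel(seeds));
for c = 1:numel(seeds)
    params  = struct('data',data,'chains',1,'warmup',nwarmup,'iter',niter,...
                     'seed',seeds(c));   % assumed field name
    fits{c} = mod.sampling(params);
    fits{c}.block();                     % wait for this chain to finish
end
% The draws from each fit can then be combined externally, keeping track
% of which chain each draw came from.
```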

Yes, it does run four chains, but I am struggling to find the results. In Matlab, the returned “fit” object has only one chain saved. For example, if I start 4 chains with 5000 iterations, I get back 5000 simulations in all my posteriors and that’s it.

I think I found a better workaround than running single-chain searches in series, though. At the end of each search I also get back 4 output files with .csv extensions. I looked into them and they look like the outputs from 4 chains. I can write my own code in Matlab to read the data from the .csv files (see the sketch below). If someone can confirm these .csv files are indeed the outputs from the different chains, that would be great. Also, I’m not sure whether these files are created by the C++ side or by Matlab after the run is over; I would feel more comfortable using them if they come from Stan directly.

Thanks for all the input, everyone. I feel like I understand things much better now, and I realize that some of the problems I’ve been seeing with the repeatability of Stan searches probably had to do with the fact that I’ve been looking at results from a single chain the whole time.
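For what it’s worth, here is a minimal sketch of how those per-chain CSV files could be read from Matlab, assuming they follow the standard CmdStan output format (lines starting with '#' are comments, followed by a header row of column names and one row per saved draw); the file names below are placeholders for whatever names appear in your output directory:

```matlab
% Read CmdStan-style CSV output, one file per chain. Placeholder names.
files  = {'output-1.csv','output-2.csv','output-3.csv','output-4.csv'};
chains = cell(1, numel(files));
for c = 1:numel(files)
    txt   = fileread(files{c});
    lines = strtrim(strsplit(txt, '\n'));
    keep  = lines(~startsWith(lines, '#') & ~cellfun(@isempty, lines));
    names = strsplit(keep{1}, ',');          % column names: lp__, ..., parameters
    rows  = cellfun(@(l) str2double(strsplit(l, ',')), keep(2:end), ...
                    'UniformOutput', false);
    chains{c} = struct('names', {names}, 'draws', vertcat(rows{:}));
end
% chains{c}.draws is an (iterations x columns) matrix for chain c; the
% per-chain matrices can be concatenated if pooled draws are wanted.
```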