I used the following code to run 3 chains in parallel, and their progress is printed to a website, as shown in the snapshots below.
The 20000 - 100 sampling iterations alone (not including warmup) take 2438.24 s for chain 2, but 76191.3 s for chain 3! Chain 1 has not finished yet.
All three chains run the same model. Why can this happen?
I had that happen with a large Gaussian process model and the default adapt_delta=0.8, the biggest problem being that the posteriors did not mix properly. The chains would take anywhere from 1 to 10 days.
From the discussion here I gathered that the value was too low, so the chains were not equally tuned after the burn-in period (or something along those lines; I'm not familiar with all the details of the NUTS tuning). Maybe you can check whether the different chains are mixing as expected and the only difference between them is the time they take, e.g. with the diagnostics sketched below.
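Something like this is what I have in mind, as a minimal sketch assuming you run the model from Python with CmdStanPy (the file names `model.stan` and `data.json` are placeholders for whatever you actually use):

```python
from cmdstanpy import CmdStanModel

# Placeholder file names; substitute your own model and data.
model = CmdStanModel(stan_file="model.stan")
fit = model.sample(data="data.json", chains=3, parallel_chains=3)

# R_hat near 1 and similar effective sample sizes across parameters suggest
# the chains are exploring the same posterior and only differ in speed.
print(fit.summary())

# CmdStan's diagnose utility also flags divergences and max-treedepth hits,
# which are typical symptoms of a too-low adapt_delta or difficult geometry.
print(fit.diagnose())
```

If R_hat is close to 1 everywhere and there are no divergences, the chains may simply have adapted to different step sizes, which by itself would explain the difference in elapsed time.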
I increased the value to adapt_delta=0.9 and mixing got better (it may still need to go up to 0.99, and/or a longer chain), and the variation in between-chain elapsed times decreased to a range of something like 2-3 days (although I ended up changing other parts of the model, so I can't compare directly).
Maybe try that first; given your elapsed times you should know within a few hours whether it helps.
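In CmdStanPy that would be roughly the following (a sketch under the same assumptions as above; the iteration counts are only illustrative):

```python
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="model.stan")  # placeholder file name
fit = model.sample(
    data="data.json",        # placeholder data file
    chains=3,
    parallel_chains=3,
    iter_warmup=100,         # illustrative counts; adjust to your run
    iter_sampling=19900,
    adapt_delta=0.9,         # default is 0.8; try 0.99 if divergences remain
)
print(fit.summary())
```

Raising adapt_delta makes the sampler adapt to a smaller step size, so each iteration gets somewhat slower, but divergences and erratic per-chain behaviour should become less frequent.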
Did it reduce the variation, at least? It may be a deeper problem; even HMC can suffer from this if the posterior has a weird shape (like the apparently infamous "funnel"). That may be fixed if you are able to constrain the parameter space through a meaningful specification of the priors (assuming no identifiability issues remain in the likelihood despite that).
If you can describe the model itself and post the Stan model code, other people here may be able to help (I don't always find Stan code intuitive to read, so I'm not sure I have the intuition to give advice based on that alone, but depending on the type of model and the actual mathematical description, maybe I could as well).