Variation in elapsed time of parallel chains and best practices for computing expectations from multiple chains

avehtari · October 28, 2018, 7:03pm

Only the number of iterations, warmup iterations, initial step size, adapt_delta and other parameters controlling or initializing adaptation are shared between chains. During adaptation, each chain gets its own step size and mass matrix.

This will not solve the convergence problem, but you will get some speedup by using

y1a ~ neg_binomial_2_log(f1 + logNormFact1a, alphaVector);
...

where logNormFact1a = log(normFact1a)
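To see why this is only a reparameterization, here is a small numerical check in plain Python. It assumes the original model scaled the mean as exp(f1) * normFact1a (the original model code is not shown here, so that is a guess), and it relies on neg_binomial_2_log(eta, phi) being neg_binomial_2(exp(eta), phi). The function names and the numbers are made up for illustration.

```python
import math

def nb2_lpmf(y, mu, phi):
    """Log pmf of the negative binomial with mean mu and dispersion phi."""
    return (math.lgamma(y + phi) - math.lgamma(y + 1) - math.lgamma(phi)
            + y * (math.log(mu) - math.log(mu + phi))
            + phi * (math.log(phi) - math.log(mu + phi)))

def nb2_log_lpmf(y, eta, phi):
    """Same distribution with log-mean eta, evaluated without computing exp(eta)."""
    log_phi = math.log(phi)
    # log(exp(eta) + phi) via a stable log-sum-exp
    m = max(eta, log_phi)
    log_mu_plus_phi = m + math.log(math.exp(eta - m) + math.exp(log_phi - m))
    return (math.lgamma(y + phi) - math.lgamma(y + 1) - math.lgamma(phi)
            + y * (eta - log_mu_plus_phi)
            + phi * (log_phi - log_mu_plus_phi))

# Hypothetical values, for illustration only
y, f1, norm_fact, phi = 7, 0.3, 12.5, 4.0

direct = nb2_lpmf(y, math.exp(f1) * norm_fact, phi)       # mean scaled on the natural scale
via_log = nb2_log_lpmf(y, f1 + math.log(norm_fact), phi)  # offset added on the log scale

print(abs(direct - via_log))  # agreement up to floating-point rounding
```

The log-scale version also stays finite for large f1, where exponentiating first would overflow. Since normFact1a is data, its log only needs to be computed once, for example in transformed data.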

Do you have a reference for that “variance parameter for each other channel”?

alpha ~ uniform(0, 1e9);

Are you sure a uniform prior is good for alpha? Maybe a hierarchical prior would be better?

The negative binomial can produce a multimodal posterior for certain values of alpha, so it’s also possible that this explains the behavior.

Depending on how the distance between the observations is distributed

ell ~ gamma(2,2);

might also be too vague.
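One rough way to check this is to compare where gamma(2,2) puts its mass with the distances actually present in the data. A minimal sketch in plain Python follows; it assumes Stan's shape/rate parameterization of the gamma distribution, and the observation locations in x are made up, so replace them with the real inputs.

```python
import math
from itertools import combinations

def gamma2_cdf(x, rate):
    """CDF of a gamma distribution with shape 2 and the given rate (closed form)."""
    return 1.0 - math.exp(-rate * x) * (1.0 + rate * x)

def gamma2_quantile(p, rate, hi=100.0):
    """Invert the CDF by bisection on [0, hi]; the CDF is monotone in x."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma2_cdf(mid, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rate = 2.0
q05 = gamma2_quantile(0.05, rate)
q95 = gamma2_quantile(0.95, rate)

# Hypothetical 1-D observation locations, for illustration only
x = [0.0, 0.4, 1.1, 1.5, 3.2, 7.0]
dists = sorted(abs(a - b) for a, b in combinations(x, 2))

print(f"prior 90% interval for ell: [{q05:.3f}, {q95:.3f}]")
print(f"pairwise distances: min {dists[0]:.3f}, max {dists[-1]:.3f}")
```

If the prior interval is much wider than the range of distances in the data, or sits mostly outside it, the data can say little about ell there and a tighter prior is worth considering.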

I recommend looking at pair plots of ell, kfDiag, kfTril, alpha and lp__ to learn about possible multimodality or funnels.
