Multiple chains with user-specified initialization

Hi,

The CmdStan guide says: "The id value makes sure that a non-overlapping set of random numbers are used for each chain." That is why I set id when running multiple chains.

Recently I have begun using the init option, which allows the user to supply an initial guess/starting point.

But the statistics of my four chains throughout the warm-up period are suspiciously similar. Is this the correct way to go about using init?

Additional question: Can I use a different seed for each chain and ignore id?

Thanks in advance!

Regardless of id or random seed, the values supplied through init will be used in every chain.

You can definitely use a different random seed for each chain; that’s basically what happens when you run the same CmdStan job one after another (but remember to use the output file option, since otherwise each run writes to the same default output file).

id strides the random number generator seeded by the seed argument to ensure that the resulting sequences of pseudo-random numbers are independent. While it’s unlikely that different seeds will lead to sequences of pseudo-random numbers that collapse onto each other, the recommended use is to set an overall seed and use id = 1, 2, 3, ... for separate chains/runs/etc.
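
To make the recommended pattern concrete, here is a minimal sketch that builds one command line per chain with a shared seed and a distinct id. The executable name ./my_model, the data file my_data.json, and the seed 12345 are placeholders, and the argument layout follows the loop shown in the CmdStan guide as I understand it, so check it against your CmdStan version.

```python
def chain_command(model_exe, chain_id, seed, data_file):
    """Build the argument list for one chain; every chain shares `seed`."""
    return [
        model_exe, "sample",
        "random", f"seed={seed}",                  # one overall seed
        f"id={chain_id}",                          # distinct id per chain
        "data", f"file={data_file}",
        "output", f"file=output_{chain_id}.csv",   # separate output per chain
    ]


# Placeholder names: ./my_model and my_data.json are assumptions.
commands = [chain_command("./my_model", i, 12345, "my_data.json")
            for i in range(1, 5)]

for cmd in commands:
    print(" ".join(cmd))
```

Each list could be handed to subprocess.run to launch the chain; the sketch only prints the commands so that it runs without a compiled model.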

As @yizhang notes, if a single init file is provided then the chains all start at the same point, and the different pseudo-random number sequences only influence the evolution of each Markov chain from that initial point. To ensure that the diagnostics are as informative as possible, one should ideally use a separate init file for each chain.
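
One way to get a separate init file per chain is to perturb a common starting point and write each perturbed copy to its own file. The sketch below is only an illustration under several assumptions: the parameter names alpha and beta and their values are made up, all parameters are unconstrained scalars, the Gaussian jitter scale is arbitrary, and your CmdStan version accepts JSON init files (older versions expect the R dump format).

```python
import json
import random


def write_jittered_inits(base, n_chains, rel_scale=0.1, seed=2024):
    """Write init_<id>.json for each chain, jittering every value in `base`.

    Assumes unconstrained scalar parameters; constrained ones (e.g. a
    scale that must stay positive) need jitter that respects the support.
    """
    rng = random.Random(seed)  # fixed seed so the init files are reproducible
    paths = []
    for chain_id in range(1, n_chains + 1):
        init = {
            name: value + rng.gauss(0.0, rel_scale * max(abs(value), 1.0))
            for name, value in base.items()
        }
        path = f"init_{chain_id}.json"
        with open(path, "w") as f:
            json.dump(init, f)
        paths.append(path)
    return paths


# Hypothetical starting point, e.g. from a least-squares fit.
base_point = {"alpha": 1.5, "beta": -0.3}
init_paths = write_jittered_inits(base_point, n_chains=4)
print(init_paths)
```

Each chain would then be launched with its own file, for example init=init_1.json for the chain with id=1.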

Thank you for the answer.

Right now I am using a simple least squares solution as my starting point for all chains. I’ll try to introduce some jitter and create unique starting points.
