Running multiple chains on different computers

Are there any known pitfalls or best practices for ensuring independent RNG behavior in completely separate instances of Stan running on different computers (some of which might be virtual machines)? Specific instructions/code greatly appreciated, especially if it’s something I have to do outside of R/cmdstanR.

Motivation:
I am using within-chain parallelization to fit a model with very long compute time. Because I have access to multiple computers, I think the best way to use the available resources is to run different chains on different computers. I am using cmdstan via cmdstanR, and I have access to cores on Mac, Linux, and Windows machines.

Is it uncontroversially ok to select different arbitrary seeds for the different chains and assume that the behavior is independent?

Yes. Indeed, it’s tough to get exactly reproducible results across machines even when you want them, so simply using a different seed on each computer shouldn’t pose any risk of non-independence.
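
For what it's worth, here is a minimal sketch of how the per-machine setup might look in cmdstanR, with one chain per computer. The file names (`model.stan`, `data.json`), the `base_seed` value, the thread count, and the `chain_seed()` helper are all hypothetical placeholders, not anything from your setup. Giving each machine its own `chain_ids` value as well as its own seed is just a belt-and-braces choice, and it keeps the output files labelled by chain.

```r
# Hypothetical per-machine settings: edit chain_id on each computer.
chain_id  <- 1L      # 1 on the first machine, 2 on the second, and so on
base_seed <- 20240L  # arbitrary placeholder; any integer is fine

# Hypothetical helper: a different arbitrary seed for each chain.
chain_seed <- function(base_seed, chain_id) base_seed + chain_id

stan_file <- "model.stan"  # hypothetical path to the model

# Only runs when cmdstanr is installed and the model file is present.
if (requireNamespace("cmdstanr", quietly = TRUE) && file.exists(stan_file)) {
  # stan_threads = TRUE enables within-chain parallelization (reduce_sum etc.).
  mod <- cmdstanr::cmdstan_model(stan_file,
                                 cpp_options = list(stan_threads = TRUE))

  fit <- mod$sample(
    data = "data.json",                      # hypothetical data file
    chains = 1,                              # one chain per computer
    chain_ids = chain_id,                    # distinct id per computer
    seed = chain_seed(base_seed, chain_id),  # distinct seed per computer
    threads_per_chain = 4                    # assumption: 4 cores on this box
  )

  # Move the CSV out of the temp directory so it can be copied elsewhere.
  fit$save_output_files(dir = ".", basename = paste0("chain", chain_id))
}
```

Once the CSV files from all the machines are copied into one folder, something like `cmdstanr::as_cmdstan_fit(list.files("csv_dir", pattern = "\\.csv$", full.names = TRUE))` (with `csv_dir` being wherever you put them) should let you treat them as a single multi-chain fit and check R-hat and ESS across chains.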
