Transformed data RNG varies by chain?


#1

I believe we previously decided that each chain should use a different (pseudo) RNG for transformed data. The problem with that approach is that it leaves us no way to monitor convergence: if each chain generates different transformed data, the chains are targeting different posteriors and can’t be compared. I now think we should keep the current behavior, which is to use the same RNG in each chain.

Further, I want to make sure that the RNG that is used in the constructor of the generated model class is a reference to the one that’s used later.

Allen suggested in a GitHub issue https://github.com/stan-dev/stan/issues/2241 that we should not make the interfaces do all this monkeying with the RNG, but instead just pass in a seed. I don’t really mind one way or another how it’s done, but the same RNG that’s going to be used for the draws should be used to instantiate the model.

The only way I can see this working is to use the seed as is for the model class, then advance the RNG based on chain ID. I don’t think we want to reuse the same draws from the RNG, so maybe we effectively require a minimum chain ID of 1. Chain 1 and every later chain would then advance from the initial seed, past whatever the constructor for the model class used.
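To make the offset idea concrete, here is a minimal sketch. It assumes `std::mt19937` as a stand-in for the real RNG, and `make_rng` and `kStride` are hypothetical names I made up for illustration, not anything in the Stan code base. The stride is assumed to be larger than the number of draws the model constructor can consume.

```cpp
#include <random>

// Hypothetical stride between streams. It must exceed the number of
// draws the model constructor (transformed data) can consume.
constexpr unsigned long long kStride = 1000;

// Seed an RNG, then skip chain_id * kStride draws.
// chain_id == 0 is the un-advanced stream the model constructor would use;
// chains use IDs >= 1 so they never replay the constructor's draws.
std::mt19937 make_rng(unsigned int seed, unsigned int chain_id) {
  std::mt19937 rng(seed);
  rng.discard(kStride * chain_id);
  return rng;
}
```

With this, `make_rng(seed, 0)` is what the constructor sees and `make_rng(seed, 1)`, `make_rng(seed, 2)`, and so on start at non-overlapping offsets, provided no stream draws more than `kStride` values.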


#2

In pseudocode this means the interfaces would do:

Model model = Model(…, int seed)
Chain chain1 = Chain(…, Model model, int id = 1)

Chain chain10 = Chain(…, Model model, int id = 10)

Internally, each chain then:

  1. uses RNG from the model for the transformed data without advancing the RNG
  2. advances the RNG by chain id
  3. uses the advanced RNG for the rest of the model

The model needs a method that returns a reference to its RNG.
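The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: `std::mt19937` stands in for the real RNG, raw draws stand in for the transformed data, and `Model`, `Chain`, `rng()`, `draw()`, and `kStride` are hypothetical names, not the generated model class API.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical spacing between chain streams.
constexpr unsigned long long kStride = 1000;

class Model {
 public:
  // The constructor generates the "transformed data" from the seeded RNG.
  Model(unsigned int seed, std::size_t n_data) : rng_(seed) {
    for (std::size_t i = 0; i < n_data; ++i) data_.push_back(rng_());
  }
  // Accessor the chains use to pick up the model's RNG.
  const std::mt19937& rng() const { return rng_; }
  const std::vector<std::mt19937::result_type>& transformed_data() const {
    return data_;
  }

 private:
  std::mt19937 rng_;
  std::vector<std::mt19937::result_type> data_;
};

class Chain {
 public:
  // Step 1: the transformed data already exists in the model; the chain
  //         copies the model's RNG, so the model's RNG is not advanced.
  // Step 2: advance the copy by an amount determined by the chain ID.
  Chain(const Model& model, unsigned int id) : rng_(model.rng()) {
    rng_.discard(kStride * id);
  }
  // Step 3: all later draws in this chain come from the advanced RNG.
  std::mt19937::result_type draw() { return rng_(); }

 private:
  std::mt19937 rng_;
};
```

Every chain built from the same `Model` sees identical transformed data, while `Chain(model, 1)` and `Chain(model, 10)` draw from different parts of the stream.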

That makes sense to me. Is this right?


#3

Turns out you don’t need a reference. The model takes the seed and uses the RNG from offset 0, and every chain (IDs starting at 1) advances from the start of that same seeded stream, so there won’t be overlap and the model doesn’t need to save its RNG.
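Roughly, the reference-free version might look like the sketch below. Again this is just an illustration under assumptions: `std::mt19937` stands in for the real RNG, raw draws stand in for transformed data, and `Model`, `chain_rng`, and `kStride` are hypothetical names. The stride is assumed to exceed the number of draws the constructor uses.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stride; assumed larger than the constructor's draw count.
constexpr unsigned long long kStride = 1000;

class Model {
 public:
  // The RNG is local to the constructor: seeded, used from offset 0,
  // then thrown away. The model keeps only the resulting data.
  Model(unsigned int seed, std::size_t n_data) {
    std::mt19937 rng(seed);
    for (std::size_t i = 0; i < n_data; ++i) data_.push_back(rng());
  }
  const std::vector<std::mt19937::result_type>& transformed_data() const {
    return data_;
  }

 private:
  std::vector<std::mt19937::result_type> data_;  // never modified later
};

// Each chain rebuilds its RNG from the same seed and skips ahead by its ID,
// so it needs nothing from the model except the seed that was used.
std::mt19937 chain_rng(unsigned int seed, unsigned int chain_id) {
  std::mt19937 rng(seed);
  rng.discard(kStride * chain_id);
  return rng;
}
```

The model here has no RNG member and no chain ID, so two models built from the same seed hold the same transformed data no matter which chain uses them.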

This is good, because I would very much like to preserve the property that (a) models aren’t aware of their chain ID and (b) models are stateless and immutable. We may need to relax (b) to allow for algorithms that change data as they go.


#4

I see, this sounds good to me!