Simulation-based calibration case study with RStan

Thank you for letting me know. I learn a lot form you many times.

Thanks. I should’ve run it rather than guessing. I’m very bad at guessing what R functions are going to do.

Is there a reason not to use 0:4 instead of these highly spread out numbers? Do seeds that are closer somehow windup generating draws that are closer? My ignorance of PRNGs runs deep!

1 Like

If your data block has something like

transformed data {
  real mu_ = std_normal_rng();
  real sigma_ = exponential_rng();
  vector[N] y;
  for (n in 1:N) y[n] = normal_rng(mu_, sigma_);
}

Then my intuition is that if you started one chain with a seed of 0 and another with a seed of 1, then the y realizations would be the same except offset by one index position. So, that is why I went with maximal spacing of the seeds.

1 Like

Generally pseudo random number generators don’t work like that. That would indicate that the sequence would repeat after .Machine$integer.max steps. The better prngs have much more states than .Machine$integer.max and integer seeds represent only small portion of those states and those states are very far away. Stan uses ecuyer1988, which has length of cycle 2^{61} and .Machine$integer.max is 2^{31}, so there are plenty of pseudo random numbers between each seed.

Mathworks did some carefully analysis years ago choosing their default prng, which is now Mersenne twister. Mersenne twister mt19937 is also what boost recommends as the first choice. It has length of cycle 2^{19937}-1. Use of ecuyer1988 in Stan has been questioned before Quality of Pseudo-Random Number Generators, but for some reason only other than boost library solutions (e.g. mt19937) were considered https://github.com/stan-dev/stan/issues/403
mt19937 would be also 13 times faster than ecuyer1988, but I guess speed has not been a problem.

1 Like

Mersenne is the typical default for most implementations for the reasons that @avehtari mentions, despite its relatively large state.

My understanding of why we went with ecuyer1988 instead of the twister is that it can be strided – offsets in the seed yield guarantees on how independent the resulting sequences appear. This is why we have chain_id – the actual seed used for each chain is seed + chain_id so that each chain will have independent PRNGs, at least within the first cycle.

For something like SBC we only need seed variation amongst chains – the changing data will ensure variation in the posterior samples from iteration to iteration.

If we wanted to consider moving beyond ecuyer1998 then it would be worthwhile to look into http://www.pcg-random.org which allows for PRNGs that can be strided (i.e. that admit multiple streams) with relatively small state and fast execution and configurable periods.

2 Likes

That was the reasoning, specifically because @bgoodri was concerned about using proper parallel RNGs (ones that can be strided ahead efficiently).

Neither speed nor randomness seem to be a problem with what we have. The other candidate that @bgoodri was recommending was

The only reason we never got around to adding it was that it was an extra dependency and the one we have seems to work fine. If we do need to change RNGs, it won’t be a big deal as it has a pretty simple interaction with the rest of the code.