Is there a preferred way to generate random seeds for rstan in R?

(I’ve searched for an answer but I can’t find one I understand. Sorry!)

I’m running a pile of sampling calls in a big loop in my R code. I’ve used set.seed to set the R session seed, but I also need to control the seed for sampling. Is there a sensible way to get it? I had expected something like get.seed() and increment.seed() but that doesn’t seem to exist and .Random.seed is a big vector and not the single integer that sampling requires.

What’s the accepted way to do this?

If you’re looking to set the seed when calling RStan via stan(), you can use the seed argument. A generic call (https://mc-stan.org/rstan/reference/stan.html) is:

stan(file, model_name = "anon_model", model_code = "",
  fit = NA, data = list(), pars = NA, chains = 4,
  iter = 2000, warmup = floor(iter/2), thin = 1,
  init = "random", seed = sample.int(.Machine$integer.max, 1), ...)

So simply set seed to a fixed integer, say seed = 0, to override the randomly drawn default and fix your own.
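If you want a different but reproducible seed for each call in the loop, one option is to draw the integers up front from the R session RNG, using the same expression that stan() uses for its default. This is only a minimal sketch: draw_stan_seeds and n_fits are names made up for illustration, and the sampling() call is left as a comment because it assumes your own stanmodel and data_list.

```r
# Hypothetical helper (not part of rstan): draw n integer seeds from the
# current R session RNG, using the same expression as stan()'s default seed.
draw_stan_seeds <- function(n) {
  sample.int(.Machine$integer.max, n)
}

set.seed(20240101)          # fix the R session seed once
n_fits <- 5L                # assumed number of sampling() calls in the loop
stan_seeds <- draw_stan_seeds(n_fits)

for (s in seq_len(n_fits)) {
  # Assumed usage; stanmodel and data_list come from your own code:
  # post <- sampling(stanmodel, data = data_list, seed = stan_seeds[s])
  message("fit ", s, " would use seed ", stan_seeds[s])
}
```

Re-running the script with the same set.seed() value gives the same vector of seeds, so each fit gets the same seed every time.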

If I understand you correctly, you want to specify the seed argument to sampling inside a loop in such a way that the first draw of the first NUTS iteration of the i-th iteration of the R loop is what it would have been had there been one more NUTS iteration on the (i - 1)-th iteration of the R loop? That is easier said than done, and it took me several minutes to write that last sentence.

I think what you can do, which is not exactly that but is more or less valid-ish, is to increment the chain_id argument to sampling and specify the seed argument to be the same integer every time you call sampling. In other words, something like

for (s in 1:S) {
  post <- sampling(stanmodel, data = data_list,
                   seed = 42, chain_id = 4 * (s - 1) + 1)
}

Doing so is supposed to make the pseudo-random numbers approximately independent of each other when the PRNG is initialized for each chain.
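To see why the loop above uses 4 * (s - 1) + 1, here is a small sketch of the chain ids it implies with the default of 4 chains per call. It spells the ids out as explicit vectors; as far as I can tell from the stan() documentation, chain_id can also be given as a vector with one id per chain, but treat that as an assumption to check against your rstan version. S = 3 and the name chain_ids are only for illustration.

```r
chains <- 4L   # chains per sampling() call (rstan's default)
S <- 3L        # assumed number of loop iterations

# First chain id for each fit, as in the loop above: 1, 5, 9
first_id <- chains * (seq_len(S) - 1L) + 1L

# Explicit id vectors, one per fit: 1:4, 5:8, 9:12
chain_ids <- lapply(first_id, function(f) f + seq_len(chains) - 1L)

# Check that no chain id is reused across fits
stopifnot(!anyDuplicated(unlist(chain_ids)))

# Assumed usage; stanmodel and data_list come from your own code:
# post <- sampling(stanmodel, data = data_list,
#                  seed = 42, chain_id = chain_ids[[s]])
```

With the seed held at 42, the chain id is the only thing that changes between calls, so the ids must not overlap.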


Thanks Ben!