I just noticed that when I use brm() to fit the same single-level linear model (Gaussian with an identity link) to the same data, with an identical function call and seed = 2023 each time, the results are identical every time with backend = "rstan", threads = NULL, whereas with backend = "cmdstanr", threads = threading(4) the results differ between the identical calls.
And when I look at mymodel$fit@stan_args, the seed is reported correctly with both backends. But “cmdstanr” seems to be ignoring it nonetheless.
You have to pass static = TRUE as an argument to threading() in order to get a (likely) exactly reproducible run… If you really need guaranteed exact reproducibility, it is better not to use threading at all.
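A minimal sketch of what that could look like, assuming brms with a working CmdStan installation. The toy data, the formula, and the helper fit_once() are hypothetical and only serve to illustrate the call; the relevant part is threading(4, static = TRUE) combined with a fixed seed.

```r
library(brms)

# Hypothetical toy data; any data frame with an outcome and a predictor works.
set.seed(1)
dat <- data.frame(x = rnorm(200))
dat$y <- 1 + 0.5 * dat$x + rnorm(200)

# Hypothetical helper: fits the same model with the same seed on every call.
fit_once <- function() {
  brm(
    y ~ x, data = dat, family = gaussian(),
    backend = "cmdstanr",
    threads = threading(4, static = TRUE),  # static scheduling of the threaded work
    chains = 2, cores = 2, seed = 2023, refresh = 0
  )
}

fit_a <- fit_once()
fit_b <- fit_once()

# Compare the posterior draws of the two fits.
all.equal(as.matrix(fit_a), as.matrix(fit_b))
```

Whether the comparison comes back as equal still depends on the machine and the compiler settings, so it is worth running a check like this on your own setup rather than taking it for granted.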
While I don’t need threading with a simple Gaussian single-level model, I do need it with many of the complex multilevel models I frequently fit. This is because of the significant speed increases that it provides. And thanks to your solution, I can now have my cake (=more speed) and eat it too (=reproducibility)!
The static option is indeed there to enable exact reproducibility. It’s just that whether it works even depends on compiler flags. As far as I recall, you should not use the fast-math option on Intel CPUs. As a rule of thumb, do test things and be less aggressive with compiler flags. If you want to read up on this, the Intel TBB documentation is a resource to consult. It’s really hard to guarantee things with threading once you throw in all optimisations.
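One way to check which flags are in play, sketched under the assumption that any custom compiler flags were set through CmdStan's make/local file (flags set elsewhere will not show up here):

```r
library(cmdstanr)

# Called without cpp_options, this returns the current lines of
# CmdStan's make/local file (or NULL if the file does not exist).
flags <- cmdstan_make_local()
print(flags)

# Show any lines that mention fast-math.
grep("fast-math", flags, value = TRUE)
```

If the last call returns any lines, those are the candidates to remove before comparing repeated runs again.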
(but… yeah… with threading things do get considerably faster in a number of cases)