I am currently facing an issue with the replicability of my estimation results. I initially estimated the parameters of my model on Windows OS, but I noticed slight differences when attempting to replicate the same estimation on macOS. I am using CmdStanR for these estimations, and I would appreciate any advice on ensuring consistent results across operating systems.
Are you getting different statistical results (e.g. mean estimates), or only different draws?
Exact numerical reproducibility (e.g. the same draws in the same order) across operating systems is not something Stan seeks to guarantee. It’s possible you could even observe differences on the same operating system on two different pieces of hardware.
But if you’re getting truly different posteriors, it would be interesting to explore why.
If you can’t share your code, can you say what you mean by “slight differences”? Are the differences in mean estimates within MCMC standard error?
MCMC is random, and results can vary across runs. Even with the same seed, floating-point arithmetic behaves differently on different hardware, under different C++ compilers and optimization levels, etc.
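To make that concrete, a fixed seed in CmdStanR pins down the RNG stream but not the floating-point arithmetic. A minimal sketch (the model file and `data_list` are placeholders, not your actual setup):

```r
library(cmdstanr)

# Hypothetical model and data; the point is the seed argument.
mod <- cmdstan_model("model.stan")
fit <- mod$sample(
  data = data_list,
  seed = 1234,   # identical seed on both machines
  chains = 4
)
```

Even with the identical seed on both machines, the compiled C++ can round differently on Windows and macOS, so the leapfrog trajectories, and hence the draws, can diverge.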
Unfortunately I can’t share my code. But yes, I observe differences in the statistics of the posterior distribution of the parameters, including mean estimates.
How large are the differences, and how large are the Monte Carlo standard errors?
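If you have both sets of draws loaded, the posterior package makes this comparison easy. A sketch, assuming `draws_win` and `draws_mac` are the draws objects from the two machines (hypothetical names):

```r
library(posterior)

# Posterior mean and its Monte Carlo standard error, per parameter.
s_win <- summarise_draws(draws_win, mean, mcse_mean)
s_mac <- summarise_draws(draws_mac, mean, mcse_mean)

# Difference in means relative to the combined MCSE; values well
# above ~2-3 would suggest more than ordinary Monte Carlo noise.
abs(s_win$mean - s_mac$mean) / sqrt(s_win$mcse_mean^2 + s_mac$mcse_mean^2)
```

If those ratios are small, the “slight differences” are just sampling noise.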
Here is one check that might be clarifying:
Run a sizeable number of chains (say six or ten) on each OS. Then copy the output CSVs from one computer to the other, load both sets of CSVs on one machine, and compute convergence diagnostics three times:

1. over the CSVs from one OS,
2. over the CSVs from the other OS,
3. over all the CSVs combined into one object.

If R-hat and the other diagnostics are not much worse in (3) than in (1) or (2), the two sets of chains are consistent with having sampled the same posterior, and it’s likely that nothing unexpected is happening.
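With cmdstanr and posterior, that check might look something like the following (the directory paths are placeholders for wherever you collect the CSVs):

```r
library(cmdstanr)
library(posterior)

# CSVs from both machines, copied onto one computer (hypothetical paths).
csvs_win <- list.files("output/windows", full.names = TRUE)
csvs_mac <- list.files("output/macos", full.names = TRUE)

d_win <- read_cmdstan_csv(csvs_win)$post_warmup_draws
d_mac <- read_cmdstan_csv(csvs_mac)$post_warmup_draws

# Treat the two sets of chains as one fit.
d_all <- bind_draws(d_win, d_mac, along = "chain")

summarise_draws(d_win, rhat, ess_bulk)  # (1) one OS
summarise_draws(d_mac, rhat, ess_bulk)  # (2) other OS
summarise_draws(d_all, rhat, ess_bulk)  # (3) combined
```

If the R-hat values in the combined summary stay close to 1, the chains from the two operating systems are mixing with each other, which is what you'd expect if both target the same posterior.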