Large CmdStan performance differences: Windows vs. Linux

Yes, I am planning to do that in the future; this was just a first step to get things running.

Not good. What does CmdStan do? You should be able to get the exact command and run it by hand.

Which versions are you referring to here? I think I recall that macOS did catch up in speed once I introduced the TBB malloc library under macOS. That gave macOS a serious speed boost. Linux did not benefit from this, which suggests that memory management under Linux is better from the start and that the TBB made the difference on macOS.

The TBB was introduced in 2.21.

I’m testing against 2.19 and 2.22.1.

Results are here (as CSV files; I have not yet gathered them into a plot). I plan to add another job that gathers them into a plot and serves it from the repo.

e.g.
pystan results
cmdstanpy results

And when these links get too old, you can access the artifacts from here.

Edit: also, testing against a single seed is not really a good idea.

Maybe we can use these results to discourage people from using Windows and eventually deprecate the platform altogether. (One has to keep one’s dreams alive!)

We want to keep the conditions the same. Given that numerics are not identical across OS/CPU/compiler/settings, results can diverge after some number of iterations, even with the same seed. What we really want to test is the basic time needed for a fixed number of iterations. So HMC is probably better than NUTS, as it controls the number of log density and gradient evals per iteration. It doesn’t test everything, though, and it could be the MCMC algorithm that is slowing things down.
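One way to set up such a fixed-work comparison is sketched below in Python. It only builds and times CmdStan command lines for static HMC over several seeds. The executable name, data file, and output file names are hypothetical placeholders, and the `int_time` value is just an illustrative choice. Step-size adaptation during warmup may still differ between platforms, so treat this as a rough sketch rather than a definitive benchmark.

```python
import subprocess
import time


def build_static_hmc_cmd(exe, data_file, seed, num_warmup=1000,
                         num_samples=1000, int_time=6.28):
    """Return a CmdStan argument list for one static-HMC run."""
    return [
        exe, "sample",
        f"num_warmup={num_warmup}",
        f"num_samples={num_samples}",
        "algorithm=hmc", "engine=static", f"int_time={int_time}",
        "data", f"file={data_file}",
        "random", f"seed={seed}",
        # Hypothetical output naming scheme: one CSV per seed.
        "output", f"file=output_{seed}.csv",
    ]


def time_command(cmd):
    """Run cmd to completion and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def benchmark(exe, data_file, seeds, timer=time_command):
    """Time one run per seed and return a {seed: seconds} mapping."""
    return {
        seed: timer(build_static_hmc_cmd(exe, data_file, seed))
        for seed in seeds
    }
```

Calling something like `benchmark("./bernoulli", "bernoulli.data.json", range(1, 5))` on each platform would give per-seed timings that can be compared directly, which also avoids relying on a single seed.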

Actually, I procrastinated a bit and found the solution (well, at least it looks promising).

WSL (Windows Subsystem for Linux) https://docs.microsoft.com/en-us/windows/wsl/install-win10
It’s available in the Microsoft Store. Easy to install, took me 5 minutes, no hiccups.

I had to install make and g++ with:
sudo apt install make g++

No mingw32-make, no TBB path stuff, it just works! Sweet. You can edit all files normally. The C: drive is mounted under /mnt/c.
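To illustrate the drive mapping, here is a small Python sketch that translates an absolute Windows path into its location under /mnt in WSL. The function name is made up for this example, and it assumes the default mount point; inside WSL the `wslpath` utility does this conversion for you.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path):
    """Map an absolute Windows path to its /mnt/<drive> location in WSL."""
    win = PureWindowsPath(path)
    # Require a drive letter plus a root, e.g. "C:\"; UNC and relative
    # paths have no /mnt/<drive> equivalent in this simple scheme.
    if len(win.drive) != 2 or not win.root:
        raise ValueError(f"not an absolute drive-letter path: {path!r}")
    letter = win.drive[0].lower()
    # parts[0] is the anchor ("C:\"); the rest are the path components.
    return str(PurePosixPath("/mnt", letter, *win.parts[1:]))
```

For example, a model stored at `C:\Users\me\model.stan` would be reachable from the WSL shell at `/mnt/c/Users/me/model.stan`.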

And the times for your model @andrjohns :
Native Windows + RTools 4.0: 400s
WSL: 140s !!

I would encourage anyone on Windows who uses cmdstan or cmdstanpy, or who runs rstan or cmdstanr via the console, to try it out. It’s easy to try and doesn’t break anything.

RStudio doesn’t work with WSL directly yet, but you can start RStudio Server in WSL and use it through the browser. Though that sounds a bit too meta for my blood: https://medium.com/lead-and-paper/how-to-use-rstudio-server-for-ubuntu-on-windows-10-a7aeee661a5d

There is also WSL2, which needs a recent build of Windows 10 and also supports GPUs and the like.

Cool! Super weird.

sounds great!

It’s actually not that weird. Microsoft basically admitted (good for them, and there is a first for everything) that PowerShell and whatnot are bad. Cygwin and other third-party stuff is more or less bad as well.

Their primary target is web devs but this seems to work out for us too.

WSL is great, and so is Docker. I do some dev work with WSL.

Still, it needs some work from the user.

I would think httpstan running on WSL or Docker would work great.

JupyterLab or Jupyter Notebook can also be used from WSL or Docker, which is great.

I recently switched to Windows (from a Mac) because of WSL (and because my employer made it harder and harder for my Mac to work).

I have found RStan works great through WSL. My current workflow is to build a model locally with WSL and then push it off to a Linux server, getting my outputs back 1 to 10 hours later when the models are done running. For anyone else using this approach, I suggest putting your Stan models into an R package to avoid recompiling.

And, @Bob_Carpenter, I wish you could convince people like my employer to move away from Windows! Although, MS is moving closer to Linux and it seems macOS is moving farther from Unix.

Hmm. When I tried out WSL last year I didn’t notice performance differences, and you can get RStudio working via something like VcXsrv. I will check performance again, but I’m interested in whether others find something similar or whether this model is for some reason particularly awkward on Windows…

I understand you and @Richard_Erickson are mostly joking, but please bear in mind that many users are on Windows for a wide range of reasons, and I think Windows shaming is not very helpful for anything. I’ve always associated it with gatekeeping and the “REAL programmers do XY” gimmick. (I have a conflict of interest here, as I am primarily on Windows and it suits me just fine, but I understand why people may make different choices.)

I’m serious about wishing my employer would move away from Windows.

Windows lacks good native compilers. It’s a well-known problem, which is why RTools exists (and, more recently, WSL). More broadly, the problem also makes general development hard on Windows. To get RStan working, I had to have my local IT spend about an hour helping me reinstall R and RTools. Even after this, I sometimes still have to reconfigure R when I try to use RStan.

macOS requires the installation of Xcode, and then RStan works well after the initial setup.

Linux simply requires an apt-get install r-cran-rstan and boom. RStan works with a single line of code.

The last reason is why I would recommend that somebody having trouble with RStan use WSL, which also allows the easy apt-get install option for RStan. The downside to using WSL to run RStan is that outputs need to be saved and then opened with Windows R to plot, but it is otherwise a decent hack around Windows.

I wasn’t serious! I know it’s here to stay. If Mac were more business friendly and didn’t pull the rug out from under users every release, they might get some business traction. As is, MS is the only company that’s actually respectful of business users, so they still have them all.

I feel like this has very little to do with Windows, and a lot to do with the difficulty of packaging RStan. Recent versions of Visual Studio produce fairly performant code, and the compiler is standards compliant. Reminder: the AAA games industry produces feats of performance engineering every year on Windows, and isn’t blaming the compiler. Instead, when you say lack of native compilers, you really mean software is developed first for Linux using GCC, and is not written in a cross-platform way.

It is not impossible to build Stan with Visual Studio; I was doing so a few months ago. The real issue is dealing with downstream library packagers who have to deal with RTools and CRAN restrictions.

Microsoft produces their own R distribution. It’s what I would use on Windows if I had to deal with security-conscious IT. I’m not sure what your example is supposed to mean regarding ease of installation - on the company Linux server, mere developers don’t have admin privileges to install system packages either.

But isn’t Visual Studio a $$$ software? Or does MS offer by now an open-source counterpart which is free?

I think you can download some visual studio “free”, but we are now talking about C++ compilers, right?

Yup…and the supposedly good vc compilers are not free to my knowledge.

Visual Studio is free for open-source software. Microsoft even offers free Windows virtual machines for development; the only restriction is that the usage license must be renewed every 3 months.