The thread you quoted is very interesting, thanks!
It seems there is a general consensus against running Stan on Windows; many people report severe performance degradation, mostly blaming either the less efficient Windows compiler(s) or the fact that Stan is developed Linux-first and then ported to Windows. So, for the sake of a general discussion about virtualized environments, let’s take Stan-on-Windows out of the equation and consider Stan-on-Linux only (virtualized or not).
Stan should (as far as I know, correct me if I’m wrong) spend 99.9% of its time on the CPU cores (crunching numbers and accessing RAM), making very few syscalls (mostly I/O). Syscalls are where performance overhead usually arises in virtualized environments.
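One rough way to sanity-check this claim, for anyone curious: compare user vs. system CPU time with `resource.getrusage` (Unix only). The busy loop below is just a hypothetical stand-in for a CPU-bound sampler, not Stan itself; a workload that hardly touches the kernel should report almost all of its time as user time:

```python
import resource

# CPU-bound stand-in for a sampler's inner loop: pure arithmetic, no I/O.
acc = 0.0
for i in range(1, 2_000_000):
    acc += 1.0 / i

usage = resource.getrusage(resource.RUSAGE_SELF)
print(f"user time:   {usage.ru_utime:.3f}s")  # time spent crunching numbers
print(f"system time: {usage.ru_stime:.3f}s")  # time spent in the kernel (syscalls)
```

If Stan behaves as described above, its `ru_utime` should dwarf its `ru_stime`, and virtualization overhead mostly taxes the latter.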
Even during model of the model by the C++ compiler, when many files are accessed, I would expect CPU+RAM to be the bottleneck, given all the expensive optimizations being applied (loop unrolling, function inlining, etc.).
During MCMC sampling, after the initial data load from the filesystem, I imagine syscalls are made only rarely:
- to get more memory from the OS (e.g. when appending new MCMC samples to the chain output buffer)
- to get the system time for measuring elapsed times
- to output debug messages
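That hypothesis could be checked empirically with `strace -c`, which summarizes syscall counts and the kernel time spent in each over a whole run. The model binary and arguments below are hypothetical (a CmdStan-style invocation); substitute your own:

```shell
# Count syscalls made by a sampler run; -f follows child processes.
# "./bernoulli" and its arguments are a hypothetical CmdStan example model.
strace -c -f ./bernoulli sample data file=bernoulli.data.json 2> syscall_summary.txt
cat syscall_summary.txt  # per-syscall counts and time, with a "total" row
```

If the description above is right, the summary should be dominated by a handful of `write`/`brk`/`mmap`-style calls, with total kernel time tiny compared to the wall-clock sampling time.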
I’m wondering whether the above description is accurate: it is what I would naively expect “theoretically”, but I’m not a specialist in numerical programming, nor an expert in Stan or virtualization. There are probably other important aspects I’ve overlooked that show up in practice.
Again, thanks in advance for any contribution :)