I apologize if this topic is a bit generic/off-topic, but the few precedents that come up in a search for “hardware” are several years old, and in the meantime the market has shifted towards ARM, GPU/TPU/NPU/XPUs, etc. So I thought it would be legitimate to ask a few questions about recommended hardware. I am particularly interested in CPU architecture and GPUs, but maybe that’s just me.
I have personal computers (including Macs) in mind, but feel free to comment on HPC clusters and other setups (I’m personally interested in the Raspberry Pi, but that’s not the focus here) to keep this general. Cutting to the chase, what would the recommendations be, if any, for the following components:
CPUs: number of cores, cache; do Intel vs. AMD and ARM vs. x86 architecture matter? Beyond between-chain parallelization, do these choices affect parallelization at all?
XPUs: I’m guessing things like TPUs and NPUs are mostly marketing, and GPUs are what’s really accessible to consumers, so the main question becomes: does Stan benefit out of the box from having a good consumer-level GPU available?
Some additional/potential issues:
Is it still a hassle to install Stan/interfaces on Windows?
Any specific recommendations for Linux distributions?
Feel free to write in with anything important I may be unaware of.
I’m assuming RAM and SSDs are non-issues now that HDDs are dead. I realize there are millions of choices when setting up hardware for Stan or any other numerically intensive task, but I’m hoping this won’t get arcanely specific.
Disclaimer: no, I am not necessarily buying a new laptop myself in the near future (I have several functioning ones lying around). However, I will likely have to specify and recommend setups for people acquiring different kinds of machines, and I would like to know how they can be optimized for software I am familiar with, especially Stan.
Stan is still predominantly CPU-oriented. That’s where most of our development time is spent, and we remain competitive with a lot of the bigger names like JAX, TFP, etc., in that domain. We’re always discussing more GPU support, for the reasons you note, but it is currently only really useful in Stan if your density has specific structure and the problem is large enough that the overhead of shipping data back and forth between the GPU and CPU is worth it.
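For concreteness, here is a minimal sketch of how that GPU path is typically switched on from R via cmdstanr’s OpenCL support; the model file name, `data_list`, and the platform/device ids `c(0, 0)` are placeholders for your own setup, and it generally only pays off for large, GPU-friendly likelihoods (e.g. the `*_glm` functions). Whether it beats a plain CPU build is very model- and size-dependent, so benchmark both.

```r
# A minimal sketch of enabling Stan's OpenCL (GPU) back end via cmdstanr.
# The file name, data_list, and device ids are placeholders for your setup.
library(cmdstanr)

# Compile with OpenCL support (requires a working OpenCL driver/runtime).
mod <- cmdstan_model(
  "bernoulli_logit_glm.stan",
  cpp_options = list(stan_opencl = TRUE)
)

# opencl_ids = c(platform, device); c(0, 0) is usually the first GPU found.
fit <- mod$sample(data = data_list, opencl_ids = c(0, 0))
```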
That said, Stan is not the only show in town, so it is worth considering the broader landscape. This overview from this month is quite good:
An SSD is useful for streaming output. If you’re on a slow cluster file system, writing draws out as they’re sampled can be a bottleneck. It can help to stream them to fast local disk, then push the whole file to persistent storage.
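A rough sketch of that pattern with cmdstanr (the scratch and storage paths are placeholders, and `mod`/`data_list` are assumed from your own workflow):

```r
# Sketch: stream CmdStan's CSV output to node-local scratch, then copy the
# finished files to the slower, persistent cluster file system in one go.
library(cmdstanr)

scratch <- Sys.getenv("TMPDIR", unset = tempdir())   # fast local SSD/scratch

fit <- mod$sample(
  data = data_list,
  output_dir = scratch        # draws stream to local disk while sampling
)

# After sampling, push the CSVs to persistent storage (path is a placeholder).
fit$save_output_files(dir = "/project/persistent/stan-runs", basename = "my_fit")
```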
With multiple cores, the bottleneck becomes memory bandwidth between RAM, the CPU caches, and the CPU. ARM is much better architected for the way Stan accesses memory than AMD/Intel. You’ll be able to use more cores without bottlenecking on memory.
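In practice, “using more cores” in Stan usually means parallel chains plus optional within-chain threading; a hedged sketch with cmdstanr is below. The file name and `data_list` are placeholders, and within-chain threading only helps if the model itself uses `reduce_sum()` or `map_rect()`.

```r
# Sketch: the two usual ways Stan spreads work over cores.
library(cmdstanr)

mod <- cmdstan_model(
  "my_model.stan",                          # placeholder file name
  cpp_options = list(stan_threads = TRUE)   # needed for within-chain threading
)

fit <- mod$sample(
  data = data_list,
  chains = 4,
  parallel_chains = 4,    # one core per chain
  threads_per_chain = 2   # extra cores per chain; ~4 x 2 = 8 cores total
)
```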
No idea what an NPU is, but TPUs are available in Google Cloud. We have no code targeting TPUs, and as Brian says, only specialized GPU support so far.
Yes, Windows is still a pain in that it’s just different enough from Mac OS X/Linux to make it a pain to support everything from MPI to multi-threading to C++ libraries. None of our developers work on Windows, so it’s not something we optimize for much.
I use Mac OS X, so no idea about Linux distributions.
I think in the future, computing is going to move more toward the cloud and away from PCs.
If you’re buying a notebook now to run Stan, I think the Mac OS X ARM machines are your best bet.
Thanks for the reference, too; it seems like a useful read. Of course different hardware/software pairings will be more or less optimized, but I did want to get an idea of where Stan stands given these more or less recent shifts, which seem to have been driven by quite a bit of AI hype.
That’s useful and interesting to know.
I’m not sure NPUs are actually a separate thing; I just wondered whether any of these accelerators had turned out to be the next big thing. But I guess the CPU is still the main bottleneck for running relatively intensive tasks on personal computers.
This thread is nice to know about for those of us building/purchasing PCs with Stan in mind – thanks.
Is it still a hassle to install Stan/interfaces on Windows?
Yep. Doable, but obnoxious. (Thankfully much less obnoxious due to the efforts of several people on this forum.) I’ve generally had way fewer ‘weird’ issues on Linux than Windows.
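For what it’s worth, the route that has been least painful for me on Windows goes through cmdstanr’s toolchain helpers; a sketch is below, with the caveat that the exact behaviour depends on your R and RTools versions.

```r
# Sketch of the usual Windows route via cmdstanr's helpers; details vary
# with your R and RTools versions.
install.packages(
  "cmdstanr",
  repos = c("https://stan-dev.r-universe.dev", getOption("repos"))
)
library(cmdstanr)

check_cmdstan_toolchain(fix = TRUE)   # checks (and can repair) the C++ toolchain
install_cmdstan(cores = 2)            # downloads and builds CmdStan itself
```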
Any specific recommendations for Linux distributions?
I’m a GNU-bie, but here’s what I’ve witnessed. The distro is perhaps not terribly critical. I will say that I’ve had occasional headaches with R and RStudio on Fedora, giving me issues above my skill level. I continue to monitor the R-SIG-Fedora list, and some threads tell me there are still oddities. (I still like Fedora a lot, though.) Ubuntu has been great for my use case and very simple to pick up.
@Bob_Carpenter With multiple cores, the bottleneck becomes memory bandwidth between RAM, the CPU caches, and the CPU. ARM is much better architected for the way Stan accesses memory than AMD/Intel. You’ll be able to use more cores without bottlenecking on memory.