Multithreading on Windows with PyStan 2.18+


Warning: These instructions can ruin your Python environment.

For anyone feeling adventurous, I wrote up simple instructions to enable multithreading on Windows with the clang-cl + MSVC toolchain.

These instructions require admin rights for the MSVC toolchain.

Good luck!

After these steps, you can follow the multithreading instructions for PyStan (ignore the Windows comment).


The Python Global Interpreter Lock, or GIL, is in simple terms a mutex (a lock) that allows only one thread to hold control of the Python interpreter at a time. All the GIL does is make sure only one thread executes Python code at a time; control still switches between threads. What the GIL prevents, then, is making use of more than one CPU core or separate CPUs to run threads in parallel.

Python threading is great for creating a responsive GUI, or for handling multiple short web requests where I/O is the bottleneck rather than the Python code. It is not suitable for parallelizing computationally intensive Python code: because of the GIL, Python threads are interleaved but in fact executed serially, so they only help when overlapping I/O operations. For actual parallelism in Python, use the multiprocessing module to spawn multiple processes that execute in parallel, or delegate the work to a dedicated external library.
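A minimal sketch of the difference, with hypothetical helper names (`busy`, `run_threads`, `run_processes` are illustrations, not anything from PyStan):

```python
from multiprocessing import Pool
from threading import Thread

def busy(n):
    # CPU-bound work: sum of squares from 0 to n-1
    return sum(i * i for i in range(n))

def run_threads(n, workers=4):
    # Threads share one interpreter, so the GIL serializes this work.
    threads = [Thread(target=busy, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def run_processes(n, workers=4):
    # Each process has its own interpreter and GIL, so the work
    # can actually run on separate CPU cores in parallel.
    with Pool(workers) as pool:
        return pool.map(busy, [n] * workers)

if __name__ == "__main__":
    run_threads(200_000)
    print(run_processes(200_000))
```

With CPU-bound work like this, the threaded version takes roughly as long as running the tasks one after another, while the process pool scales with the number of cores.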

Thanks for the Python threading information.

Stan threading runs in C++, so Python threading and the GIL don’t affect it.


Just curious: why didn’t PyStan choose to do threading in Python? Having models be pickle-able is great, and it’s just one further step to spawn subprocesses and pass them the model…
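The step described above might look roughly like this sketch; `DummyModel` is a stand-in for a pickle-able compiled model, and none of these names are PyStan API:

```python
import pickle
from multiprocessing import Pool

class DummyModel:
    # Stand-in for a compiled, pickle-able Stan model (not PyStan API).
    def sample(self, seed):
        return f"chain with seed {seed}"

def run_chain(args):
    payload, seed = args
    # Each subprocess unpickles its own copy of the model.
    model = pickle.loads(payload)
    return model.sample(seed)

def run_chains(model, n_chains=4):
    # Pickle the model once, then hand it to each worker process.
    payload = pickle.dumps(model)
    with Pool(n_chains) as pool:
        return pool.map(run_chain, [(payload, s) for s in range(n_chains)])

if __name__ == "__main__":
    print(run_chains(DummyModel()))
```

The trade-off hinted at later in the thread applies here: every subprocess pays the cost of serializing and deserializing the model (and its data).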

Before 2.18, Stan was not thread-safe.

PyStan3 (actually httpstan) uses threading for parallel chains.

OK. Perhaps OT, but will some variant of PyStan continue which interfaces to in-process compiled Stan code? It’s nice not having to serialize+deserialize for the really big data sets.

The current PyStan should be replaced by PyStan3 at some point, and the old PyStan will be deprecated.

For big datasets, even the current PyStan is not optimal, because the data is copied (I think? possibly multiple times) inside the Python+Cython layer.

We should check whether there is a way to send data from the Python process straight to the Stan process.

But even copying is fairly fast compared to, say, np.savetxt(fname, big_array); np.loadtxt(fname). If httpstan standardizes a binary data exchange format, then it’d be similar (at least on a local machine).
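A rough illustration of that comparison (an informal sketch, not a rigorous benchmark; array size and timings are arbitrary):

```python
import os
import tempfile
import time

import numpy as np

big_array = np.random.rand(200, 200)

# In-memory copy
t0 = time.perf_counter()
copied = big_array.copy()
copy_time = time.perf_counter() - t0

# Text round-trip through savetxt/loadtxt
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as f:
    fname = f.name
t0 = time.perf_counter()
np.savetxt(fname, big_array)
loaded = np.loadtxt(fname)
text_time = time.perf_counter() - t0
os.unlink(fname)

print(f"copy: {copy_time:.6f}s, savetxt+loadtxt: {text_time:.6f}s")
```

On a typical machine the text round-trip is orders of magnitude slower than the in-memory copy, since every float is formatted to decimal text and parsed back.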

That is true. I’m not sure if np.savetxt is the best benchmark for save speeds.

PyStan3 uses ujson for serialization, but there has also been discussion of a binary format.

If copying your data per chain is a problem for… then I am not sure you will be able to fit that model in finite time anyway with full Bayes.

I’m mainly trying to avoid parsing the large CSV files that come with large latent-state models.