PyStan throws error when running chains in parallel (n_jobs > 1)

pbhambhani · August 20, 2020, 5:27am

I recently installed PyStan from conda-forge in a fresh conda environment on macOS Mojave 10.14.5. I’m having trouble running multiple chains in parallel (i.e. with n_jobs > 1). Stan does not seem to be able to find the compiled model when n_jobs > 1, but it has no such problem when n_jobs = 1. The error is as follows:

Process SpawnPoolWorker-4:
Traceback (most recent call last):
  File "/Users/pbhambhani/misc/ljmu/summer_project/pystan_env/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/pbhambhani/misc/ljmu/summer_project/pystan_env/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/pbhambhani/misc/ljmu/summer_project/pystan_env/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/Users/pbhambhani/misc/ljmu/summer_project/pystan_env/lib/python3.8/multiprocessing/queues.py", line 358, in get
    return _ForkingPickler.loads(res)
ModuleNotFoundError: No module named 'stanfit4anon_model_482f7e78a7c2d6f09648bba041f6f372_1939374203243207487'

Other information:
Python version: 3.8.5
PyStan version: 2.19.1.1
Compiler: clang 9.0.1 - I think this is installed by PyStan, and is different from the default clang version on my system. The latter reads Apple LLVM version 10.0.1 (clang-1001.0.46.4).

I also noticed a couple of linker warnings. I am not a C++ expert, so I’m not sure whether they are related to the issue I’m facing, but I’m posting them here just in case.

clang-9: warning: -Wl,-export_dynamic: 'linker' input unused [-Wunused-command-line-argument]

ld: warning: -pie being ignored. It is only used when linking a main executable

Finally, FWIW, I also have ScalaStan installed on my machine for a different project, and it has no trouble running chains in parallel. That one seems to use CmdStan 2.19.1.

Any help is appreciated. Thanks!

ahartikainen · August 20, 2020, 6:19am

Do you run your Python code in a script?

Try running your code inside a __name__ == "__main__" block:

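import pystan

# With the "spawn" start method each worker process re-imports this script,
# so compiling and sampling must only happen under the main guard.
if __name__ == "__main__":
    sm = pystan.StanModel(...)
    fit = sm.sampling()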

I think this is due to the behavior of multiprocessing: on macOS it now uses spawn instead of fork by default.

See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
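
To check which start method is in effect on a given machine, a minimal sketch along these lines may help. The helper name start_method_report is hypothetical (it is not part of PyStan or the standard library); it only wraps two standard-library calls, multiprocessing.get_start_method and multiprocessing.get_all_start_methods.

```python
import multiprocessing


def start_method_report():
    """Return the current start method and the ones this platform supports.

    Hypothetical helper, for illustration only.
    """
    return {
        # allow_none=True avoids fixing the context as a side effect;
        # the value is None if no start method has been chosen yet.
        "current": multiprocessing.get_start_method(allow_none=True),
        "available": multiprocessing.get_all_start_methods(),
    }


if __name__ == "__main__":
    print(start_method_report())
```

If "current" is None, the platform default will be used once a pool is created. If it is spawn, each worker starts from a fresh interpreter, which would be consistent with the workers failing to import the compiled model module.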

pbhambhani · August 20, 2020, 6:30am

Thanks for the reply. I have been running it in a Jupyter notebook so far.

I tried running it in the __name__ == "__main__" block, and the error still persists :(

From the multiprocessing docs, it seems the switch to spawn as the default happened in Python 3.8. I could try an earlier Python 3 release to see whether that fixes the error.

ahartikainen · August 20, 2020, 7:35pm

Could you add this to the first cell, before any other imports?

import multiprocessing
multiprocessing.set_start_method("fork")

pbhambhani · August 21, 2020, 4:31am

That works indeed! Thank you.

I was going to ask whether this should be logged as a PyStan issue, but it looks like someone already raised it back in April: https://github.com/stan-dev/pystan/issues/693

My_Work · November 13, 2020, 12:13pm

Hi,

This also worked for me at first, but then it stopped working. If I now run:

import multiprocessing
multiprocessing.set_start_method("fork")

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-27-b0fa033a5f53> in <module>
      1 import multiprocessing
----> 2 multiprocessing.set_start_method("fork")

~/anaconda3/lib/python3.8/multiprocessing/context.py in set_start_method(self, method, force)
    241     def set_start_method(self, method, force=False):
    242         if self._actual_context is not None and not force:
--> 243             raise RuntimeError('context has already been set')
    244         if method is None and force:
    245             self._actual_context = None

RuntimeError: context has already been set

and the terminal window says the same as before:

Process SpawnPoolWorker-212:
Process SpawnPoolWorker-211:
Process SpawnPoolWorker-213:
Process SpawnPoolWorker-214:
Traceback (most recent call last):
  File "/Users/jan/anaconda3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/jan/anaconda3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/jan/anaconda3/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/Users/jan/anaconda3/lib/python3.8/multiprocessing/queues.py", line 358, in get
    return _ForkingPickler.loads(res)
ModuleNotFoundError: No module named 'stanfit4anon_model_649148968e47447c8e6aa386bb89c275_3157769158311472631'
Traceback (most recent call last):
.
.
.

Any idea what is happening and how to fix it?
Thanks

ahartikainen · November 13, 2020, 12:39pm

Which Python version do you have?

Edit: Do you run any code before calling that line?

My_Work · November 13, 2020, 12:42pm

Python 3.8.5, and yes. Actually, when I restarted the kernel just now, it worked. I can try to reproduce how it happened (I ran some PyMC3 models; could that have an effect?).

ahartikainen · November 13, 2020, 12:43pm

Yes, I think PyMC3 does some multiprocessing set-up too.
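
The RuntimeError above is raised whenever the start method has already been fixed in the current process, whether by an earlier cell or by another library. A defensive sketch, assuming it is acceptable to fall back to whatever method is already active, could look like this. The function name ensure_start_method is hypothetical.

```python
import multiprocessing


def ensure_start_method(method="fork"):
    """Try to select `method`; return the start method actually in effect.

    Hypothetical helper, for illustration only.
    """
    # "fork" is not offered on every platform (e.g. Windows), so check first.
    if method in multiprocessing.get_all_start_methods():
        try:
            multiprocessing.set_start_method(method)
        except RuntimeError:
            # The context was already set earlier in this process.
            pass
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    print(ensure_start_method("fork"))
```

If the returned value is not fork, restarting the kernel and running this before anything else is the approach that worked in this thread. The traceback also shows a force parameter on set_start_method; passing force=True overrides an existing context, but this thread does not establish whether that is safe once worker pools already exist.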
