Frustrating problems with Pystan (anaconda Python)

Hello everyone,

I am new to Stan, and I am currently trying to run a Bayesian model using PyStan as my interface. I’m not sure whether it is just me or whether everyone else has also had frustrating experiences with PyStan.

When I installed PyStan on macOS using pip install pystan, I got PyStan 2.x for some reason, and I was not able to upgrade it to 3.x even when I specified the version explicitly. For days (if not weeks) I kept getting errors whose cause I could not figure out, and the problems were somehow solved when I decided to work with PyStan 3.x on Linux. As far as I know, I did not change anything in the model itself, only the Python syntax (e.g. from import pystan to import stan).

I had not run my script for quite a while. It was fine last time, but now I suddenly get this error: FileNotFoundError: [Errno 2] No such file or directory: '/home/user_name/.cache/httpstan/4.4.2/models/6ih6n2q6/fits/ifaffpmx.jsonlines.lz4'
I suppose this has nothing to do with my script, but I also cannot figure out what it means.

What frustrated me even more is that when I tried running the same script on a different machine (also Linux), I got a different error:
RuntimeError: Exception during call to services function: `IndexError("Exception: array[uni, ...] index: accessing element out of range. index 3 out of range; expecting index to be between 1 and 2 (in '/tmp/httpstan_5sk13x19/model_l7scj23i.stan', line 69, column 8 to column 33)")`, traceback: `[' File "/home/user_name/miniconda3/envs/env/lib/python3.8/site-packages/httpstan/services_stub.py", line 159, in call\n future.result()\n']`

I also have another script that works just fine on one machine but fails on another (with RuntimeError: Initialization failed.).

Has anyone else had problems similar to these? Is PyStan not a reliable interface? Does it have anything to do with my Python version, etc.? If so, what are the recommended versions to use?

Hi, sorry to hear that you have problems installing and using pystan. PyStan certainly is not the easiest package to install.

Let me check what we can do for some of the problems.

There could be a couple of reasons why pip installs PyStan 2.x. First, make sure that the pip you are using is for Python 3 (I think that by default pip and python on macOS are for Python 2.7, which is not supported by PyStan).

The easiest way to check this is to first find out what the Python you want to use is called, e.g. python3, and then use that Python executable to call pip:

python3 -m pip install pystan

Now, there could also be a problem with your pip version. Many operating systems still ship an old pip by default, which doesn’t support the latest wheel types. In that case, though, I think pip would probably complain that the correct httpstan version could not be found.

To update your pip tool:

python3 -m pip install pip -U
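
One way to confirm which interpreter and which pip are actually in use is to ask Python itself. The snippet below is only a rough sketch: the helper name describe_environment is made up for illustration and is not part of PyStan. It reports the running executable and the versions of pip, pystan and httpstan visible to that interpreter (None means the package is not installed for it).

```python
import sys
from importlib import metadata


def describe_environment(packages=("pip", "pystan", "httpstan")):
    """Report the running interpreter and the installed versions of some packages."""
    report = {
        "executable": sys.executable,
        "python": ".".join(str(part) for part in sys.version_info[:3]),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed for this interpreter
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the same executable you use for your script (e.g. python3 check_env.py, where check_env.py is a hypothetical file name), so that the report describes the right environment.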

About the FileNotFoundError: the httpstan library automatically caches your models to avoid unnecessary compilation, but I think in this case the cache has either been cleaned or something else unusual has happened. Either way, this is definitely a bug in httpstan, which should not fail in these cases but simply recompile the model. Let’s add a bug report to the httpstan GitHub so we can fix this issue.
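
Until such a bug is fixed, removing the cache directory by hand should force a recompile on the next run. The following is a minimal sketch under one assumption: that the cache lives under ~/.cache/httpstan, which is inferred only from the path in the error message above. The helper name clear_model_cache is hypothetical.

```python
import shutil
from pathlib import Path


def clear_model_cache(cache_root):
    """Delete a cache directory tree if it exists; return True if something was removed."""
    cache_root = Path(cache_root).expanduser()
    if not cache_root.is_dir():
        return False
    shutil.rmtree(cache_root)
    return True


# Example call (assumed location, taken from the FileNotFoundError path):
# clear_model_cache("~/.cache/httpstan")
```

The call is left commented out on purpose, since it deletes every cached model and fit under that directory.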

The IndexError is probably caused by a bug in your Stan code. I think the stanc3 version that httpstan uses has these bounds checks in place and will complain about any out-of-range index.

That said, the exception message could be a bit more verbose, so that users have an idea of what to do next.

The "Initialization failed" error is an interesting problem, which was also seen with PyStan 2 (and, I think, with any Stan backend version). Usually it means that your model is brittle and fails to find initial values. Why it fails on only one computer and not on another is something I don’t have a good answer for. Maybe something closer to the hardware is implemented differently, and this makes the other machine fail more easily?
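
Since Stan arrays are indexed from 1, an error like "index 3 out of range; expecting index to be between 1 and 2" often points to an index in the data that is larger than the array it is used with. A small check on the Python side, before the data is handed to Stan, can catch this early. This is only a sketch: the helper check_stan_indices and the data names K and group are made up for illustration and do not come from the original model.

```python
def check_stan_indices(indices, upper, name="index"):
    """Raise ValueError if any 1-based index falls outside 1..upper."""
    bad = [(pos, value) for pos, value in enumerate(indices, start=1)
           if not 1 <= value <= upper]
    if bad:
        pos, value = bad[0]
        raise ValueError(
            f"{name}[{pos}] = {value} is out of range; expected 1..{upper} "
            f"({len(bad)} offending value(s) in total)"
        )


# Hypothetical data: 'group' indexes into an array of size K = 2 in the Stan model.
data = {"K": 2, "group": [1, 2, 2, 1]}
check_stan_indices(data["group"], data["K"], name="group")  # passes silently
```

If the data differs between machines (for example, a different input file), a check like this would also explain why the same script fails on one machine and not on another.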

Yes, you are not the only one with these problems. We are actively working on them.

PyStan needs Python 3.7 or later, and we recommend using the latest Python version for which we have a wheel, currently 3.9. When Python 3.10 is released, for example, we first need to wait until our dependencies support it before we can release a new set of wheels. This can take anywhere from days to months.

ps. Windows OS is not currently supported, but we do recommend users trying WSL (Linux on Windows)
ps2. here is a link for our FAQ Frequently Asked Questions — pystan 3.2.0 documentation

Also, a bit off-topic: for post-sampling analysis you can use the ArviZ library: ArviZ: Exploratory analysis of Bayesian models — ArviZ dev documentation

One goal of the CmdStanPy interface is to simplify the install process, and it too plays nicely with ArviZ for downstream processing: Installation — CmdStanPy 0.9.77 documentation

Hi, thanks for your prompt reply.

So, I tried running some tests again on 3 different machines, here are the specs:

  1. Python 3.9.6, pystan 3.1.1 (installed via conda), gcc 9.3.0. I got the following error message:
Traceback (most recent call last):
  File "/home/mardian/bayesian_inference/IntermediaryModel.py", line 466, in <module>
    fit = posterior.sample(num_chains=6, num_warmup=5000, num_samples=5000)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/model.py", line 84, in sample
    return self.hmc_nuts_diag_e_adapt(num_chains=num_chains, **kwargs)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/model.py", line 103, in hmc_nuts_diag_e_adapt
    return self._create_fit(function=function, num_chains=num_chains, **kwargs)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/model.py", line 306, in _create_fit
    return asyncio.run(go())
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/model.py", line 231, in go
    raise RuntimeError(message)
RuntimeError: Exception during call to services function: `BrokenProcessPool('A process in the process pool was terminated abruptly while the future was running or pending.')`, traceback: `['  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/httpstan/services_stub.py", line 159, in call\n    future.result()\n', '  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/httpstan/services_stub.py", line 159, in call\n    future.result()\n']`
  2. Python 3.8.5, pystan 3.2.0 (installed via conda), gcc 9.3.0. I got the following error message:
Traceback (most recent call last):
  File "IntermediaryModel.py", line 466, in <module>
    fit = posterior.sample(num_chains=6, num_warmup=5000, num_samples=5000)
  File "/home/mardian/miniconda3/envs/env/lib/python3.8/site-packages/stan/model.py", line 84, in sample
    return self.hmc_nuts_diag_e_adapt(num_chains=num_chains, **kwargs)
  File "/home/mardian/miniconda3/envs/env/lib/python3.8/site-packages/stan/model.py", line 103, in hmc_nuts_diag_e_adapt
    return self._create_fit(function=function, num_chains=num_chains, **kwargs)
  File "/home/mardian/miniconda3/envs/env/lib/python3.8/site-packages/stan/model.py", line 306, in _create_fit
    return asyncio.run(go())
  File "/home/mardian/miniconda3/envs/env/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/home/mardian/miniconda3/envs/env/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/mardian/miniconda3/envs/env/lib/python3.8/site-packages/stan/model.py", line 231, in go
    raise RuntimeError(message)
RuntimeError: Exception during call to services function: `IndexError("Exception: array[uni, ...] index: accessing element out of range. index 3 out of range; expecting index to be between 1 and 2 (in '/tmp/httpstan_5sk13x19/model_l7scj23i.stan', line 69, column 8 to column 33)")`, traceback: `['  File "/home/mardian/miniconda3/envs/env/lib/python3.8/site-packages/httpstan/services_stub.py", line 159, in call\n    future.result()\n']`
  3. Python 3.9.6, pystan 3.2.0 (installed via conda), gcc 9.3.0. Here is the error message:
Traceback (most recent call last):
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/httpstan/views.py", line 103, in handle_create_model
    _, stanc_warnings = httpstan.compile.compile(program_code, stan_model_name)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/httpstan/compile.py", line 24, in compile
    with importlib.resources.path(__package__, "stanc") as stanc_binary:
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/contextlib.py", line 117, in __enter__
    return next(self.gen)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/importlib/resources.py", line 175, in _path_from_reader
    opener_reader = reader.open_resource(norm_resource)
  File "<frozen importlib._bootstrap_external>", line 1055, in open_resource
FileNotFoundError: [Errno 2] No such file or directory: '/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/httpstan/stanc'
Traceback (most recent call last):
  File "/home/mardian/bayesian_inference/IntermediaryModel.py", line 465, in <module>
    posterior = stan.build(model, data=data)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/model.py", line 512, in build
    return asyncio.run(go())
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/model.py", line 479, in go
    match = re.search(r"""ValueError\(['"](.*)['"]\)""", resp.json()["message"])
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/stan/common.py", line 24, in json
    return simdjson.loads(self.content)
  File "/home/mardian/anaconda3/envs/env/lib/python3.9/site-packages/simdjson/__init__.py", line 52, in loads
    return parser.parse(s, True)
  File "simdjson/csimdjson.pyx", line 455, in csimdjson.Parser.parse
ValueError: The JSON document has an improper structure: missing or superfluous commas, braces, missing keys, etc.

I do not understand why I get 3 different errors for the same script when I run it on different machines. I also recompiled the model (after cleaning the cache, as suggested), but I still got these errors. How can I fix them?

Thanks

Can you try running the example code shown in PyStan — pystan 3.2.0 documentation

Does this also happen with 1 chain? I wonder if conda somehow affects our multiprocessing call.

Can you try the pip version?

I think there is a bug in your Stan code. Can you try running the example mentioned previously?

Again, I wonder if there is something weird going on with conda packages. Can you try with pip versions (both pystan and simdjson)?

I tried running the example code as you suggested, in a separate virtual environment where I installed everything via pip. I still got the same error message under Python 3.9.5, pystan 3.1.1:

Traceback (most recent call last):
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/httpstan/views.py", line 253, in handle_show_params
    services_module = httpstan.models.import_services_extension_module(model_name)
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/httpstan/models.py", line 90, in import_services_extension_module
    module: ModuleType = importlib.util.module_from_spec(spec)  # type: ignore
  File "<frozen importlib._bootstrap>", line 565, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1173, in create_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
ImportError: /home/mardian/.cache/httpstan/4.4.2/models/tbv6gc4i/stan_services_model_tbv6gc4i.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNSt19basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev
Traceback (most recent call last):
  File "/home/mardian/bayesian_inference/example.py", line 27, in <module>
    posterior = stan.build(schools_code, data=schools_data)
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/stan/model.py", line 511, in build
    return asyncio.run(go())
  File "/home/mardian/anaconda3/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/mardian/anaconda3/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/stan/model.py", line 503, in go
    raise RuntimeError(resp.json()["message"])
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/stan/common.py", line 24, in json
    return simdjson.loads(self.content)
  File "/home/mardian/bayesian_inference/pip-env/lib/python3.9/site-packages/simdjson/__init__.py", line 61, in loads
    return parser.parse(s, True)
ValueError: The JSON document has an improper structure: missing or superfluous commas, braces, missing keys, etc.

Apparently this error did not come from my model in the previous script. Any ideas what might cause this?
If you think that conda is what causes all of this trouble, how come I was able to run my script previously with no error?

No, I don’t think there are any problems with conda, but one needs to start debugging somewhere.

For example, the BrokenProcessPool error is something we saw previously on macOS (Failing to get it working on mac osx 11.2 · Issue #247 · stan-dev/pystan · GitHub), but this should be fixed now (fix: Require multiprocessing to use "fork" by riddell-stan · Pull Request #561 · stan-dev/httpstan · GitHub → httpstan 4.4.2).

You could try calling your code inside an if __name__ == "__main__": block.

But probably something is failing inside the subprocess, and that error is never shown in the main Python process; only the BrokenProcessPool exception is raised. I wonder if there is a difference in how Linux and macOS handle the error.
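
For reference, the general shape of such a script is sketched below. run_model is a placeholder for whatever the script currently does at top level (for example the stan.build(...) and posterior.sample(...) calls seen in the tracebacks); it is not a PyStan function.

```python
def run_model():
    """Placeholder for the script body: build the model and draw samples here."""
    # e.g. posterior = stan.build(...); fit = posterior.sample(...)
    return "done"


def main():
    result = run_model()
    print(result)
    return result


if __name__ == "__main__":
    # Only runs when the file is executed as a script, not when a worker
    # process re-imports the module.
    main()
```

The point of the guard is that nothing expensive happens at import time, so a worker process that imports the module does not start the whole run again.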

Is this a new error message for 3.1.1?

Again, I think something is failing on the httpstan side, which causes the output to be empty or to contain invalid items. We would need to see the underlying error; the message probably contains something that shouldn’t be there, or maybe it is empty. Or maybe the payload size is too big?

cc @ariddell

Actually, the last error was from the example code, so I don’t think the payload size is that big. I am still not able to test it on the Mac because, even after updating pip to the latest version and making sure it belongs to Python 3, I still get pystan 2.19 installed. I got another error when I tried to force version 3.2.0:

ERROR: Could not find a version that satisfies the requirement httpstan<4.6,>=4.5 (from pystan) (from versions: 0.2.5, 0.3.0, 0.3.1, 0.4.0, 0.5.0, 0.6.0, 0.7.2, 0.7.3, 0.7.5, 0.7.6, 0.8.0, 0.9.0, 0.10.1, 4.0.0)
ERROR: No matching distribution found for httpstan<4.6,>=4.5

You might need to upgrade pip with python3 -m pip install -U pip. Then you should be able to install the newest pystan.

The reason you need to upgrade pip is that the binary wheel formats keep changing. Only newer versions of pip know about the newer versions.

(This only applies to traditional Python installations. Conda isn’t supported as far as I know.)

I did actually upgrade my pip before attempting to install pystan (I am using pip 21.2.4). That should be the latest pip version, but I still got pystan 2.19 on the Mac. It worked fine on Ubuntu, though.

It may be conda then. PyStan isn’t tested on conda at all.

I’d suggest trying to get things working using the global environment or creating a virtualenv.

You’re the first person who has had this precise experience, I think. If someone else encounters this, I think we would probably want to add an FAQ item mentioning that things are unlikely to work under conda.

I would not recommend using global python on macOS.

Also, if you have conda, how do you call your python and how did you call pip? Make sure that you really have updated your pip and the pip you use is actually the correct pip for your python.

I did use virtualenv.

Yes, my pip is updated, using python3 -m pip install pip -U. I have triple checked about my pip version.

As for conda, I called my script from inside conda environment with python3 script-name.py

Let me summarize it again, so that it does not get mixed up. Here are my problems:

  1. I was not able to install pystan 3.x on the Mac, using pip 21.2.4 on Python 3.9.2. I still got pystan 2.19 installed no matter how often I updated pip, and specifying python3 -m pip install pystan==3.2.0 did not work either. I got the message ERROR: Could not find a version that satisfies the requirement httpstan<4.6,>=4.5 (from pystan)

  2. I then tried on 3 different Linux machines, all on either Python 3.8 or 3.9. I was able to install pystan 3.1.1 or 3.2.0, but then I got 3 different errors on these machines (see the replies above). I tried both a virtualenv (using pip) and a conda environment. I also tried the example code from the website: it worked on one machine and did not on another. What confused me more is that another script, which had been running fine before, suddenly stopped working with one of the errors I posted above. I don’t think there is anything wrong with my machines’ specs, and I do not think the errors come from my model or script either. I have no idea what the error messages mean, and if there is no quick fix for them, I might as well move on to another framework.

Is that a macOS M1 machine?

I am not too sure about that, but I assume not. It is a MacBook Pro Retina 13" (late 2013), Intel i5 2.4 GHz, 8 GB DDR3 RAM.

Ok, not M1.

What OS version do you have? 10.14?

I think we don’t have wheels for it, only for 10.15 (@ariddell ?)

How do you run these Linux machines? Are they native or in a VM? If they are VMs, there might be some memory setting that is being exceeded and causes the script to crash.
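
A script can check the macOS version itself before attempting an install or a run. The sketch below uses platform.mac_ver() from the standard library; the helper names are made up for illustration, and the "10.15" threshold is only the guess from this reply, not a confirmed PyStan requirement.

```python
import platform


def parse_version(text):
    """Turn '10.14.6' into (10, 14, 6); an empty string becomes ()."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def macos_at_least(minimum, current=None):
    """Return True/False on macOS, or None when not running on macOS."""
    if current is None:
        current = platform.mac_ver()[0]  # '' on non-macOS systems
    if not current:
        return None
    return parse_version(current) >= parse_version(minimum)


if __name__ == "__main__":
    # Assumed threshold; replace it with whatever the wheels actually require.
    print(macos_at_least("10.15"))
```

On a Linux machine the function returns None, so the same script can be used on every platform mentioned in this thread.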

It might be easiest if we dealt with each platform separately.

If you cannot get things working on a supported platform (see https://pystan.readthedocs.io/en/latest/installation.html for requirements) then you should open a bug report at https://github.com/stan-dev/pystan/issues. Might make sense to start with Linux since it’s very likely we can quickly resolve the issue.

Conda is not supported right now so let’s leave that platform/distribution to one side.

The macOS version is actually 10.14. Do I have to update to 10.15 to use pystan 3.x on the Mac, then?
I did not find any information about macOS version requirements in the documentation.

As for the Ubuntu machines, they are all VMs. I just realized that the OS is either 16.04 or 18.04. Is there any chance the issues were caused by the OS version as well? But that still does not answer my question: why was I previously able to run my scripts without errors under the same OS? If it is a memory setting, how can I find out how to solve the issue? If these issues are due to slight variations in OS versions or machine specs, is there anything in the documentation about the detailed specs required to run pystan smoothly?

Neither 16.04 nor 18.04 is supported. So that solves that problem.

I don’t really understand macOS wheel compatibility. We rely on GitHub Actions and the multibuild project for building the macOS wheels. We set a variable MACOSX_DEPLOYMENT_TARGET: "10.9" – but perhaps this is being ignored, or is impossible to satisfy for some reason.

Wish I could be of more assistance.

Hi, I think conda has a version that might work.

Can you try the following? (It is probably a good idea to use a new conda environment created with the conda-forge channel.)

conda create -n stan -c conda-forge python=3.9
conda activate stan # or source activate stan


conda install pystan httpstan -c conda-forge