Save fit model in pystan 2

cmlakhan · October 16, 2017, 2:51pm

In reading the pystan documentation, I am a little confused about how I save my fitted model. I originally followed the example here

https://pystan.readthedocs.io/en/latest/avoiding_recompilation.html

but it seems that just saves a compiled model and not fitted data. I am running a model that looks like the following

fit = model.sampling(data=data, iter=10000, warmup=8000, chains=2)

How do I go about saving this model fit for future use? Do I simply use pickle as follows?

import pickle
with open('fit.pkl', 'wb') as f:
    pickle.dump(fit, f)

jjramsey · October 16, 2017, 4:32pm

From my brief time using PyStan some months back, pickling the fit object seemed to work. However, when I did it, there was a warning message indicating that this was an experimental feature of PyStan, and that one should unpickle the model associated with the pickled fit object before unpickling the fit object itself.

What I ultimately ended up doing (before switching to RStan) was to pickle certain outputs from the fit object, such as the dictionaries returned from running fit.extract() and fit.get_sampler_params().
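A minimal sketch of that approach, assuming the outputs have already been pulled out of the fit. The helper names, the file name, and the small stand-in values are hypothetical; they replace a real fit so the snippet runs without PyStan.

```python
import os
import pickle
import tempfile

def save_outputs(path, draws, sampler_params):
    """Pickle plain containers taken from a fit, not the fit object itself."""
    with open(path, "wb") as f:
        pickle.dump({"draws": draws, "sampler_params": sampler_params}, f,
                    protocol=pickle.HIGHEST_PROTOCOL)

def load_outputs(path):
    """Read back the dictionary written by save_outputs."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Stand-in values (assumption). With a real fit these would be
# draws = fit.extract() and sampler_params = fit.get_sampler_params().
draws = {"mu": [0.12, -0.03, 0.08], "lp__": [-7.1, -6.9, -7.4]}
sampler_params = [{"stepsize__": [0.5, 0.5, 0.5]}]

path = os.path.join(tempfile.mkdtemp(), "fit_outputs.pkl")
save_outputs(path, draws, sampler_params)
restored = load_outputs(path)
```

Since only ordinary containers are stored, reading the file back should not need the compiled model, only whatever libraries the stored values come from (NumPy, for real draws).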

cmlakhan · October 16, 2017, 5:58pm

ah thanks so much, will try that!

ahartikainen · October 16, 2017, 6:40pm

You need to import (unpickle) the model before the fit object. One way to do this is to save the model and the fit together in a dictionary. This works with Python 3.6, where dictionaries are ordered (use an OrderedDict or a list otherwise).

import pickle
with open("model_fit.pkl", "wb") as f:
    pickle.dump({'model' : model, 'fit' : fit}, f, protocol=-1)
    # or with a list
    # pickle.dump([model, fit], f, protocol=-1)

and then later (on the same OS; the pickle is not cross-platform compatible)

import pickle
with open("model_fit.pkl", "rb") as f:
    data_dict = pickle.load(f)
    # or with a list
    # data_list = pickle.load(f)
fit = data_dict['fit']
# fit = data_list[1]
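To illustrate why the order matters: pickle restores the items of a list in the order they were written, so the first item is rebuilt before the second. A small sketch with stand-in objects (the `Recorder` class is hypothetical and is not a PyStan object) that log when they are restored:

```python
import pickle

restore_order = []  # filled in as objects are unpickled

class Recorder:
    """Stand-in object that logs the moment pickle restores it."""
    def __init__(self, label):
        self.label = label

    def __setstate__(self, state):
        # Called by pickle while loading this object.
        self.__dict__.update(state)
        restore_order.append(self.label)

model = Recorder("model")  # stand-in for a pystan.StanModel
fit = Recorder("fit")      # stand-in for the object returned by model.sampling()

payload = pickle.dumps([model, fit], protocol=-1)
restored_model, restored_fit = pickle.loads(payload)
print(restore_order)  # ['model', 'fit']
```

Putting the fit first in the list would reverse the restore order, which is the situation to avoid with real PyStan objects.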

cmlakhan · October 17, 2017, 5:44pm

Do you have any recommendation for how to save if I wanted to open in R?

cmlakhan · October 17, 2017, 5:46pm

I just discovered ShinyStan and it would be great to load into R and use it.

jjramsey · October 17, 2017, 6:01pm

If you’re talking about saving in Python to open in R, I don’t know if there are any great options, especially if you are aiming to use ShinyStan. You can use Python’s json module to dump dictionaries to JSON files, and then read in those files using the jsonlite R package on CRAN. That may be too limiting, though. There’s also the feather library, but that is for communicating data frames between R and Python, and may also be limiting.
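A rough sketch of the JSON route, assuming the draws are held in a dictionary of array-like values. The `to_jsonable` helper, the file name, and the stand-in values are hypothetical; with a real fit the dictionary would come from fit.extract().

```python
import json
import os
import tempfile

def to_jsonable(value):
    """Recursively turn array-like values into plain lists for json.dump."""
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if hasattr(value, "tolist"):  # e.g. NumPy arrays and scalars
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value

# Stand-in for the dictionary of draws (assumption).
draws = {"mu": [0.12, -0.03, 0.08], "sigma": (1.1, 0.9, 1.0)}

path = os.path.join(tempfile.mkdtemp(), "draws.json")
with open(path, "w") as f:
    json.dump(to_jsonable(draws), f)

# Read the file back in Python to check the round trip.
with open(path) as f:
    round_trip = json.load(f)
```

On the R side, something like jsonlite::fromJSON("draws.json") should then give a named list of the draws, though that is not a stanfit object and so may not be enough for ShinyStan.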

However, if you’re talking about saving in one R session to open in another session, then you can simply save the fit object to a file with saveRDS() and open it with readRDS(). (There are a few functions that don’t work with saved fit objects, though. See here: https://groups.google.com/forum/#!msg/stan-users/XRqaIyh96Lo/mAb6YC4MBwAJ)

cmlakhan · October 18, 2017, 3:18pm

Yeah, I meant the former. Will play around with your suggestion and see if I can get it to work.

deltasata · April 19, 2020, 1:18pm

I followed this dumping and reading method, but I am facing a weird problem. After saving, I can read the file in the same Python session (be it the same script, the same IPython session, or the same Jupyter notebook). In this successful reading case we get:

Stan model: anon_model_c36d319d3ed03c96dd75853df01607bc

But if I try to read the pkl file with the same code (as given by you) in a different session (a different Python script or Jupyter notebook), I get the following error:
ModuleNotFoundError: No module named 'stanfit4anon_model_c36d319d3ed03c96dd75853df01607bc_7926293176758986504'

This is so confusing. On top of that, when I try to follow the link
https://pystan.readthedocs.io/en/latest/unpickling_fit_without_model.html
I get the error:

AttributeError: module 'pystan.experimental' has no attribute 'unpickle_fit'

Again, I fail to understand why there are so many problems in pystan with these basic steps.

ariddell · April 19, 2020, 1:23pm

Using pickle to save fits is not supported unless the model is also pickled. I’m not sure the warning at the top of the linked documentation page (“This feature is experimental and nothing is guaranteed.”) is strong enough.

This is a known issue and will be resolved in PyStan 3.

edit: add “unless the model is also pickled”

deltasata · April 19, 2020, 1:27pm

If so, why has this issue not been mentioned here (in this thread)? Also, do you agree how peculiar this problem is? One can read the pkl file in the same session but not in a different session. It takes some time to digest this.

ahartikainen · April 19, 2020, 1:39pm

What pystan version are you using?

Did you save your fit with the model? E.g. in a list where the model is the first item?

ariddell · April 19, 2020, 1:39pm

If you’re careful to pickle both the model and the fit, things should work. The fit depends on the model.

We welcome suggestions for changes to the documentation.

deltasata · April 20, 2020, 7:17am

I could not find code to check the installed pystan version. However, I could find it from conda list; the pystan version is 2.19.0.0. I followed your code exactly:

with open("model_fit.pkl", "wb") as f:
    pickle.dump({'model' : model, 'fit' : fit}, f, protocol=-1)
    # or with a list
    # pickle.dump([model, fit], f, protocol=-1)

So, I saved the model. I already mentioned that I could read it perfectly in the same session (be it a Python script or a Jupyter notebook). But in a different session (on the same computer/operating system), reading with the same code throws the error. Also, in the latter case I find that a random number is added after the correct stanfit4anon_model name while reading.

deltasata · April 20, 2020, 7:17am

But it is not working for me.

ahartikainen · April 20, 2020, 7:39am

print(pystan.__version__)

What Python version do you use? 2.7 and <3.6 don’t always handle order “correctly” for dicts.

That extra string is not a problem.

deltasata · April 20, 2020, 8:19am

pystan=2.19.0.0
python=3.6.9

Also checked on another computer with python 3.7 and facing the same issue!

Could you please confirm that on your computer this reading function works without any issue in a different python session? If yes, could you please let me know your pystan, python versions?

ahartikainen · April 20, 2020, 8:51am

These work. (You need to have pystan installed both when saving and when loading.)

run_pystan.py (339 Bytes) run_pystan2.py (181 Bytes)

Tested with Python 3.7 and PyStan 2.19.1.1

deltasata · April 20, 2020, 9:32am

These two scripts work perfectly on my computer! But when I do exactly the same (both saving and reading) with my own fitting/sampling, the reading does not work (only when I try to read from a different session). I will try to prepare a minimal working example for this problem.
I am super confused.

deltasata · April 20, 2020, 11:13am

Okay, after a detailed look I found my problem. I was dumping the model code instead of the pystan.StanModel() object. The main problem was that my erroneous method worked perfectly when I tested it in the same Jupyter notebook session, so I built the whole pipeline around it. Now I have to change everything. A big thanks to you for all the help.
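For anyone hitting the same thing, a small guard along these lines might catch the mistake at save time. It is only a sketch: `dump_model_and_fit` is a hypothetical helper, and the tuple and dict below are stand-ins for a real compiled model and fit.

```python
import io
import pickle

def dump_model_and_fit(model, fit, f):
    """Pickle [model, fit], refusing a plain string in place of the model."""
    if isinstance(model, str):
        raise TypeError(
            "expected a compiled model object, got the model code as a string"
        )
    # Model first, so it is restored before the fit on load.
    pickle.dump([model, fit], f, protocol=-1)

model_code = "parameters { real mu; } model { mu ~ normal(0, 1); }"
stand_in_fit = {"stand-in": "fit"}

# Passing the Stan program text instead of the compiled model is rejected.
try:
    dump_model_and_fit(model_code, stand_in_fit, io.BytesIO())
    rejected = False
except TypeError:
    rejected = True

# Any non-string object is accepted (stand-in for a pystan.StanModel).
buffer = io.BytesIO()
dump_model_and_fit(("stand-in model",), stand_in_fit, buffer)
```

With real objects the first argument would be the result of pystan.StanModel(model_code=...) rather than the code string itself.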
