From my brief time using PyStan some months back, pickling the fit object seemed to work. However, when I did it, there was a warning message indicating that this was an experimental feature of PyStan, and that one should unpickle the model associated with the pickled fit object before unpickling the fit object itself.
What I ultimately ended up doing (before switching to RStan) was to pickle certain outputs from the fit object, such as the dictionaries returned from running fit.extract() and fit.get_sampler_params().
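For what it's worth, here is a minimal sketch of that approach. The helper names `save_fit_outputs` and `load_fit_outputs` are made up for illustration, and the values below are plain-Python stand-ins; with PyStan 2 they would be whatever `fit.extract()` and `fit.get_sampler_params()` return. The idea is that only ordinary containers go into the pickle, so loading should not depend on the compiled model.

```python
import os
import pickle
import tempfile


def save_fit_outputs(extracted, sampler_params, path):
    """Pickle plain containers taken from a fit, not the fit object itself."""
    payload = {"extracted": extracted, "sampler_params": sampler_params}
    with open(path, "wb") as f:
        pickle.dump(payload, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_fit_outputs(path):
    """Read back the dictionary written by save_fit_outputs."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in values (assumption): in real use these would come from
# fit.extract() and fit.get_sampler_params().
extracted = {"mu": [0.1, 0.3, 0.2], "lp__": [-5.0, -4.8, -4.9]}
sampler_params = [{"stepsize__": [0.5, 0.5, 0.5]}]

path = os.path.join(tempfile.mkdtemp(), "fit_outputs.pkl")
save_fit_outputs(extracted, sampler_params, path)
loaded = load_fit_outputs(path)
```

The trade-off is that you lose the fit object's methods and keep only the draws and sampler diagnostics.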
You need to import (unpickle) the model before the fit object. One way to do this is to save the model and the fit together in a dictionary. This works with Python 3.6, where dictionaries keep insertion order (use an OrderedDict or a list otherwise).
import pickle
with open("model_fit.pkl", "wb") as f:
    pickle.dump({'model' : model, 'fit' : fit}, f, protocol=-1)
    # or with a list
    # pickle.dump([model, fit], f, protocol=-1)
and then later (on the same OS; the pickle is not cross-platform compatible)
import pickle
with open("model_fit.pkl", "rb") as f:
    data_dict = pickle.load(f)
    # or with a list
    # data_list = pickle.load(f)
fit = data_dict['fit']
# fit = data_list[1]
If you’re talking about saving in Python to open in R, I don’t know if there are any great options, especially if you are aiming to use ShinyStan. You can use Python’s json module to dump dictionaries to JSON files, and then read in those files using the jsonlite R package on CRAN. That may be too limiting, though. There’s also the feather library, but that is for communicating data frames between R and Python, and may also be limiting.
However, if you’re talking about saving in one R session to open in another session, then you can simply save the fit object to a file with saveRDS() and open it with readRDS(). (There are a few functions that don’t work with saved fit objects, though. See here: https://groups.google.com/forum/#!msg/stan-users/XRqaIyh96Lo/mAb6YC4MBwAJ)
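For the Python-to-R route via JSON mentioned above, a rough sketch might look like the following. The helper names `to_jsonable` and `dump_draws_json` are hypothetical, and `draws` is a stand-in for a dictionary such as the one returned by `fit.extract()`. NumPy arrays are not JSON-serializable directly, so the sketch converts anything with a `tolist()` method into nested lists first.

```python
import json
import os
import tempfile


def to_jsonable(value):
    """Recursively convert arrays, tuples and dicts into JSON-friendly types."""
    if hasattr(value, "tolist"):  # e.g. NumPy arrays and scalars
        return value.tolist()
    if isinstance(value, dict):
        return {str(key): to_jsonable(val) for key, val in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


def dump_draws_json(draws, path):
    """Write a dictionary of draws to a JSON file."""
    with open(path, "w") as f:
        json.dump(to_jsonable(draws), f)


# Stand-in draws (assumption): real draws would be arrays from fit.extract().
draws = {"mu": (0.1, 0.3, 0.2), "sigma": [1.0, 1.2, 0.9]}

json_path = os.path.join(tempfile.mkdtemp(), "draws.json")
dump_draws_json(draws, json_path)

# Read the file back to confirm it is valid JSON.
with open(json_path) as f:
    reloaded = json.load(f)
```

On the R side, `jsonlite::fromJSON()` should be able to read the resulting file, though you get plain lists and vectors rather than a stanfit object.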
I followed this dumping and reading method, but I am running into a strange problem. After saving, I can read the file back in the same Python session (whether the same script, the same IPython session, or the same Jupyter notebook). In that case reading succeeds and we get:
Stan model: anon_model_c36d319d3ed03c96dd75853df01607bc
But if I try to read the pkl file with the same code (as given by you) in a different session (a different Python script or Jupyter notebook), I get the following error:
ModuleNotFoundError: No module named 'stanfit4anon_model_c36d319d3ed03c96dd75853df01607bc_7926293176758986504'
Using pickle to save fits is not supported unless the model is also pickled. I’m not sure the warning at the top of the linked documentation page (“This feature is experimental and nothing is guaranteed.”) is strong enough.
This is a known issue and will be resolved in PyStan 3.
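To illustrate why this can work in one session and fail in another, here is a small sketch that does not use PyStan at all. A pickle stores a reference to an object's class (module name plus class name), not the class itself, so loading fails whenever that module cannot be imported. The module name `runtime_only_module` and the class `Fit` are invented for the demo; the assumption is that the per-model module PyStan builds behaves similarly in this respect.

```python
import pickle
import sys
import types

# Create a module at runtime and register it, standing in for a module that
# only exists in the session that built it. All names here are made up.
module_name = "runtime_only_module"
mod = types.ModuleType(module_name)
exec(
    "class Fit:\n"
    "    def __init__(self, draws):\n"
    "        self.draws = draws\n",
    mod.__dict__,
)
sys.modules[module_name] = mod

blob = pickle.dumps(mod.Fit([1.0, 2.0, 3.0]))

# Same session: the module is importable, so loading works.
same_session = pickle.loads(blob)

# Simulate a fresh session by forgetting the module.
del sys.modules[module_name]
try:
    pickle.loads(blob)
    error_name = None
except ModuleNotFoundError as exc:
    error_name = type(exc).__name__

# Making the module available again lets the very same bytes load.
sys.modules[module_name] = mod
restored = pickle.loads(blob)
```

The last step plays the role that unpickling the model first plays in the PyStan case: once the module is available again, the fit can be loaded.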
If so, why has this issue not been mentioned here (in this thread)? Also, do you agree that this problem is peculiar? One can read the pkl file in the same session but not in a different session. It takes some time to digest this.
I could not find code to check the installed PyStan version, but conda list shows PyStan 2.19.0.0. I followed your code exactly:
with open("model_fit.pkl", "wb") as f:
    pickle.dump({'model' : model, 'fit' : fit}, f, protocol=-1)
    # or with a list
    # pickle.dump([model, fit], f, protocol=-1)
So, I saved the model. As I already mentioned, I can read it perfectly in the same session (be it a Python script or a Jupyter notebook). But in a different session (on the same computer and operating system), reading with the same code throws the error. Also, in the latter case I find that a random number is appended to the correct stanfit4anon_model name while reading.
I also checked on another computer with Python 3.7 and hit the same issue!
Could you please confirm that on your computer this reading code works without any issue in a different Python session? If yes, could you please let me know your PyStan and Python versions?
These two scripts work perfectly on my computer! But when I do exactly the same thing (for both saving and reading) with my own fitting/sampling, the reading does not work (only when I try to read from a different session). I will try to prepare a minimal working example for this problem.
I am super confused.
Okay, after a detailed look I found my problem. I was dumping the model code instead of the pystan.StanModel() object. What misled me was that my erroneous method worked perfectly when I tested it in the same Jupyter notebook session, so I built the whole pipeline around it. Now I have to change everything. A big thanks to you for all the help.
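In case it helps anyone who lands here with the same mistake, a small guard like the one below would have caught it at save time. This is only a sketch: `dump_model_and_fit` is a hypothetical helper, and the demo uses stand-in objects instead of a real model and fit. With PyStan 2 installed, a stricter check would be `isinstance(model, pystan.StanModel)`.

```python
import os
import pickle
import tempfile


def dump_model_and_fit(model, fit, path):
    """Pickle [model, fit], refusing model code passed as plain text."""
    if isinstance(model, (str, bytes)):
        raise TypeError("expected a compiled model object, got model code as text")
    with open(path, "wb") as f:
        # Model first, so it is unpickled before the fit on load.
        pickle.dump([model, fit], f, protocol=-1)


# Stand-in demo: a string in the model slot is rejected before anything is written.
path = os.path.join(tempfile.mkdtemp(), "model_fit.pkl")
try:
    dump_model_and_fit("data { int N; }", {"mu": [0.1]}, path)
    rejected = False
except TypeError:
    rejected = True
```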