and putting all of this in a list. Then I wanted to save the result using pickle:
# save the output
with open("output_all_test.txt", "wb") as whata:  # Pickling
    pickle.dump(output, whata)
However, I got the following warning:
C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:68: UserWarning: Pickling fit objects is an experimental feature!
The relevant StanModel instance must be pickled along with this fit object.
When unpickling the StanModel must be unpickled first.
Questions:
How do I save it properly?
When I tried to unpickle it, it complained that the model is not present. Is there a way to save this data and load it again later?
with open("output_all_test.txt", "rb") as fp:  # Unpickling
    b = pickle.load(fp)
Error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-8-2a2f12470a42> in <module>
8
9 with open("output_all.txt", "rb") as fp: # Unpickling
---> 10 b = pickle.load(fp)
ModuleNotFoundError: No module named 'stanfit4anon_model_29fef7469510d8c5b4085b6c92003df1_8213092754976547650'
Yes, you should save the model together with the fit. (The model is actually a Python module, and it has to be loaded before the fit can be unpickled.)
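To illustrate why the ModuleNotFoundError appears: pickle stores an object's class by reference (module name plus class name), so the module must be importable when loading. The sketch below reproduces this with the standard library only. The module name `fake_generated_fit_module` and the `Fit` class are hypothetical stand-ins for the module PyStan generates, not PyStan's actual internals.

```python
import pickle
import sys
import types

# Hypothetical name standing in for a generated 'stanfit4anon_model_...' module.
MODULE_NAME = "fake_generated_fit_module"

# Source of a tiny stand-in "fit" class that lives inside the generated module.
CLASS_SOURCE = (
    "class Fit:\n"
    "    def __init__(self, draws):\n"
    "        self.draws = draws\n"
)


def register_module():
    """Create the stand-in module at runtime and make it importable."""
    module = types.ModuleType(MODULE_NAME)
    exec(CLASS_SOURCE, module.__dict__)
    sys.modules[MODULE_NAME] = module
    return module


def demo():
    """Pickle a fit, drop its module, and show that loading then fails."""
    module = register_module()
    payload = pickle.dumps(module.Fit([0.1, 0.2]))

    # Simulate a fresh session in which the generated module does not exist.
    del sys.modules[MODULE_NAME]
    try:
        pickle.loads(payload)
        failed = False
    except ModuleNotFoundError:
        failed = True

    # Once the module is available again, the same bytes load fine.
    register_module()
    restored = pickle.loads(payload)
    return failed, restored.draws


if __name__ == "__main__":
    print(demo())
```

In the real case, unpickling the StanModel is what makes the generated module available again, which is why the model has to be loaded before the fit.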
There are a couple of ways.
I recommend that you split model compilation and fitting.
stan_model = pystan.StanModel(file=model_path)
outputs = []
for ...:
    fit = stan_model.sampling(data=...)
    outputs.append(fit)
# you can use compression too
import gzip
with gzip.open(...pickle.gz, "wb") as f:
    # the open file handle must be passed as the second argument
    pickle.dump({"model" : stan_model, "outputs" : outputs}, f)
This assumes you use the same model for each fit. If that is not the case, compile a new model inside the for loop and store each model together with its fit:
outputs.append([stan_model, fit])
You can then just pickle the outputs list.
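As a minimal sketch of the save-and-load round trip, the two helpers below (`save_bundle` and `load_bundle`, hypothetical names) write the model and the fits into one gzip-compressed pickle and read them back. They are written against plain Python objects. With PyStan you would pass the StanModel and the list of fits, and loading has to happen in an environment where pystan is installed.

```python
import gzip
import pickle


def save_bundle(path, model, outputs):
    """Pickle the model and its fits together into one gzip file."""
    # "model" is listed first: pickle writes dict items in insertion order,
    # so the model is also the first item restored on load.
    with gzip.open(path, "wb") as f:
        pickle.dump({"model": model, "outputs": outputs}, f)


def load_bundle(path):
    """Load the bundle back and return (model, outputs)."""
    with gzip.open(path, "rb") as f:
        bundle = pickle.load(f)
    return bundle["model"], bundle["outputs"]
```

Usage would look like `save_bundle("fits.pickle.gz", stan_model, outputs)` followed later by `stan_model, outputs = load_bundle("fits.pickle.gz")`, where the file name is only an example.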
There are a couple of other ways to save the fit information too: arviz.InferenceData (stored as NetCDF) or pandas.DataFrame.
import arviz as az
idata = az.from_pystan(fit)
idata.to_netcdf(path)
# latest summary info (updated rhat + ess)
az.summary(idata) # or az.summary(fit)
# plotting
az.plot_pair(idata)
I recommend that you then save these values with ArviZ, because the model is not really tied to the fit, so any additional calculation that relies on it could be wrong.
Thank you very much for your answer; it covers most of the things. I have two minor problems remaining. I tried what you proposed for saving: normal pickle works, but gzip produces the same message as before:
with gzip.open('output_all_test_pickle.gz', "wb") as f:
    pickle.dump({"model" : stan_model, "outputs" : output}, f)
C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: UserWarning: Pickling fit objects is an experimental feature!
The relevant StanModel instance must be pickled along with this fit object.
When unpickling the StanModel must be unpickled first.
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-81-bd5f3e21db03> in <module>
1 import pystan.experimental
----> 2 pystan.experimental.unpickle_fit('output_all.txt')
3
4 #with open(data_path + "RS_all_bKarmSLR.csv", "rb") as fp:
5 # output_all = fp
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pystan\experimental\pickling_tools.py in unpickle_fit(path, open_func, open_kwargs, module_name, return_model)
225
226 with open_func(path, **open_kwargs) as f:
--> 227 fit = pickle.load(f)
228 if return_model:
229 logger.warning("The model binary is built against fake model code. Model should not be used.")
ModuleNotFoundError: No module named 'stanfit4anon_model_29fef7469510d8c5b4085b6c92003df1_9201068612674835575'
It is not a big deal for me to rerun it and save it properly; I just wanted to mention it.