Unpickle saved results without stan model, pystan

Hi all,

I have run a Stan model in a for loop over different conditions using the following command:

output_temp = pystan.stan(file=model_path, data=data_bandit_temp, 
                        chains=4, iter=2000, warmup=1000, thin=1, init='random', verbose=True, 
                        control = {"adapt_delta":0.95, "stepsize":1, "max_treedepth":10}, n_jobs=-1)

output.append(output_temp)

putting all of the results into a list. Then I wanted to save the results using pickle:

#save the output        
with open("output_all_test.txt", "wb") as whata:   #Pickling
    pickle.dump(output, whata)

However, I got the following error:

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:68: UserWarning: Pickling fit objects is an experimental feature!
The relevant StanModel instance must be pickled along with this fit object.
When unpickling the StanModel must be unpickled first.

Questions:

  1. How do I save it properly?

  2. When I tried to unpickle it, it complained that the model was not present. Is there a way to save this data and load it again?

with open("output_all_test.txt", "rb") as fp:   # Unpickling
    b = pickle.load(fp)

Error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-8-2a2f12470a42> in <module>
      8 
      9 with open("output_all.txt", "rb") as fp:   # Unpickling
---> 10     b = pickle.load(fp)

ModuleNotFoundError: No module named 'stanfit4anon_model_29fef7469510d8c5b4085b6c92003df1_8213092754976547650'

Hi,

Yes, one should save the model with the fit. (The model is actually a Python module, and you need to load it before the fit can be unpickled.)
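To illustrate why the module matters, here is a minimal sketch using only the standard library. It does not use PyStan: the module name `fake_stanfit_module` and the class `Fit` are hypothetical stand-ins for the module and fit class that get generated at compile time. The point is that pickle stores a reference to the class's module rather than the class itself, so loading fails when that module cannot be imported.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module generated when a model is compiled.
module_name = "fake_stanfit_module"
module = types.ModuleType(module_name)


class Fit:
    """Stand-in for a fit class that lives inside the generated module."""

    def __init__(self, draws):
        self.draws = draws


# Make the class look as if it was defined inside the generated module.
Fit.__module__ = module_name
Fit.__qualname__ = "Fit"
module.Fit = Fit
sys.modules[module_name] = module

# Pickle records "fake_stanfit_module.Fit" plus the instance state.
payload = pickle.dumps(Fit([0.1, 0.2]))

# Simulate a fresh session in which the module no longer exists.
del sys.modules[module_name]
try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as exc:
    error = exc

# Once the module is available again, the same bytes load fine.
sys.modules[module_name] = module
restored = pickle.loads(payload)
```

In this sketch, `error` holds the same kind of `ModuleNotFoundError` as in the traceback above, and `restored` is only usable after the module has been registered again.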

There are a couple of ways.

I recommend that you split model compilation and fitting.

stan_model = pystan.StanModel(file=model_path)

outputs = []
for ...:
    fit = stan_model.sampling(data=...)
    outputs.append(fit)

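# you can use compression too
import gzip
import pickle

# path: your output file name, e.g. ending in "pickle.gz"
with gzip.open(path, "wb") as f:
    # the model comes first in the dict, so it is pickled before the fits
    pickle.dump({"model" : stan_model, "outputs" : outputs}, f)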

This assumes you use the same model for every fit. If that is not the case, compile a new model inside the for loop and keep each model together with its fit:

outputs.append([stan_model, fit])

You can then just pickle the outputs list.
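For completeness, a minimal sketch of the full save and load round trip with gzip and pickle. The helper names `save_bundle` and `load_bundle` are hypothetical, and the demo uses plain Python objects as stand-ins for the StanModel and the fits, so it runs without PyStan. With real PyStan objects the calls would look the same, assuming the model and fits pickle as described above.

```python
import gzip
import os
import pickle
import tempfile


def save_bundle(path, model, outputs):
    """Pickle the model and its fits into one gzip-compressed file."""
    with gzip.open(path, "wb") as f:
        # The model is the first dict entry, so it is written before the fits.
        pickle.dump({"model": model, "outputs": outputs}, f)


def load_bundle(path):
    """Load the bundle back and return (model, outputs)."""
    with gzip.open(path, "rb") as f:
        bundle = pickle.load(f)
    return bundle["model"], bundle["outputs"]


# Demo with stand-in objects instead of a real StanModel and real fits.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_pickle.gz")
    save_bundle(demo_path, "stand-in model", [{"theta": [0.1, 0.2]}])
    demo_model, demo_outputs = load_bundle(demo_path)
```

Keeping the model and the fits in one file means they cannot get separated later, which is the situation that caused the original `ModuleNotFoundError`.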

There are a couple of other ways to save the fit information too: as arviz.InferenceData (stored as NetCDF) or as a pandas.DataFrame.

import arviz as az
idata = az.from_pystan(fit)
idata.to_netcdf(path)
# latest summary info (updated rhat + ess)
az.summary(idata) # or az.summary(fit)
# plotting
az.plot_pair(idata)

https://arviz-devs.github.io/arviz/

# pandas
dataframe = fit.to_dataframe()
dataframe.to_csv(path)

To load data from old fits that don’t have the model saved, see https://pystan.readthedocs.io/en/latest/unpickling_fit_without_model.html

I recommend that you then save these values with arviz, because the model recovered that way is not really related to the fit, so any additional calculation that relies on it could be wrong.

Hi @ahartikainen,

Thank you very much for your answer, it covers most things. I have two minor problems remaining. I tried what you proposed for saving: normal pickle works, but gzip gives the same error message:

with gzip.open('output_all_test_pickle.gz', "wb") as f:
    pickle.dump({"model" : stan_model, "outputs" : output}, f)

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: UserWarning: Pickling fit objects is an experimental feature!
The relevant StanModel instance must be pickled along with this fit object.
When unpickling the StanModel must be unpickled first.
  
I had also tried this before:
import pystan.experimental
pystan.experimental.unpickle_fit('output_all.txt')

but I got the same error of missing model:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-81-bd5f3e21db03> in <module>
      1 import pystan.experimental
----> 2 pystan.experimental.unpickle_fit('output_all.txt')
      3 
      4 #with open(data_path + "RS_all_bKarmSLR.csv", "rb") as fp:
      5 #    output_all = fp

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pystan\experimental\pickling_tools.py in unpickle_fit(path, open_func, open_kwargs, module_name, return_model)
    225 
    226     with open_func(path, **open_kwargs) as f:
--> 227         fit = pickle.load(f)
    228     if return_model:
    229         logger.warning("The model binary is built against fake model code. Model should not be used.")

ModuleNotFoundError: No module named 'stanfit4anon_model_29fef7469510d8c5b4085b6c92003df1_9201068612674835575'

It is not a big deal for me to rerun it and save it properly; I just wanted to mention it.

It is a warning, not an error.

You could try passing the module name manually:

pystan.experimental.unpickle_fit('output_all.txt', module_name="stanfit4anon_model_29fef7469510d8c5b4085b6c92003df1_9201068612674835575")
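If you do not want to copy the module name out of the traceback by hand, a small sketch like the following could read it from the exception instead. The helper `missing_module_name` is hypothetical and uses only the standard library. The demo file is a tiny hand-written pickle stream that points at a module that does not exist, standing in for an old fit file.

```python
import os
import pickle
import tempfile


def missing_module_name(path):
    """Return the name of the module pickle cannot import, or None if loading works."""
    try:
        with open(path, "rb") as f:
            pickle.load(f)
    except ModuleNotFoundError as exc:
        # The import machinery stores the missing module's name on the exception.
        return exc.name
    return None


# Demo: a protocol-0 pickle stream that references missing_stanfit_module.Fit.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pkl")
    with open(demo_path, "wb") as f:
        f.write(b"cmissing_stanfit_module\nFit\n.")
    found = missing_module_name(demo_path)
```

The returned string could then be passed as `module_name` to `pystan.experimental.unpickle_fit`, as in the call above. For a gzip-compressed file the helper would need `gzip.open` instead of `open`.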