Pickling error?

So far I have been exploring PyStan/Stan and greatly enjoying it. However, I just tried to run the NUTS sampler for 2 million samples on 12 threads, with 800,000 burn-in, for around 500 variables (two latent-variable time series that are difficult to sample directly). The sampling took a while but finished successfully. On attempting to save the fit results, though, I got a long error (it looked somewhat like a segfault, with hexadecimal strings) ending in:

error: 'i' format requires -2147483648 <= number <= 2147483647

Looking on GitHub, it seems the error is due to a limitation of pickling (https://github.com/joblib/joblib/issues/387). Is there any way around this error, whether through a fix, an option I’m not using, or a hack?
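For reference, the bounds in that message are those of a signed 32-bit integer. As far as I understand, older versions of Python's multiprocessing module write the byte length of each pickled message into a header of that type, so a result larger than about 2 GiB cannot be sent back from a worker process. Below is a minimal sketch that triggers the same kind of error using only the standard struct module; the helper name fits_in_int32_header is made up for illustration and is not part of any library.

```python
import struct

# Largest value the 'i' (signed 32-bit) struct format can hold.
INT32_MAX = 2**31 - 1


def fits_in_int32_header(n_bytes):
    """Return True if n_bytes can be packed as a signed 32-bit integer.

    Hypothetical helper: it mimics packing a message length into a
    32-bit header, which is where the reported error seems to arise.
    """
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        # struct raises here when the number is outside the 'i' range.
        return False


print(fits_in_int32_header(INT32_MAX))      # size just within the limit
print(fits_in_int32_header(INT32_MAX + 1))  # one byte over the limit
```

If this reading is right, the problem is the size of the object being passed between processes rather than anything specific to the model.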

Thanks in advance!

Do a few hundred iterations and stop. 800,000 and 2,000,000 are only reasonable for sampling schemes that do not mix well, and the whole point of using NUTS is that it does mix well for lots of models.

Hi Bgoodri, normally I’d agree, but with this particular model the posterior distribution does not appear to settle down until after more than 400,000 burn-in iterations. With MCMC, I think it’d be impossible to accurately sample. I wanted to make sure it had truly settled down at 400,000, so I wanted to run 800,000 burn-in iterations. The 2 million is because I am thinning to be safe. That is the first thing I’d cut, however, if there is no option but to reduce the sample size (which seems like a major software limitation, if that is the case).

If it takes 400,000 warmup iterations, you have bigger problems than pickling. I would say the same thing at 4000.

There are some options:

  1. Run your model with n_jobs=1
  2. Use CmdStan
  3. Manually patch the multiprocessing module (I don’t currently remember how that was done)
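
Whichever option is used, it may help to estimate how much data a single chain would hand back. The sketch below is only a back-of-the-envelope calculation under stated assumptions: each saved value is an 8-byte float, the 2 million draws and 500 variables from the original post apply per chain, pickling overhead is ignored, and the thinning factor of 10 is a made-up example (the actual factor is not given in the thread). The function names are hypothetical.

```python
# Largest message length a signed 32-bit header can describe, in bytes.
INT32_MAX = 2**31 - 1

# Assumption: every saved draw of every variable is stored as a float64.
BYTES_PER_VALUE = 8


def estimated_payload_bytes(n_draws, n_params, thin=1):
    """Rough size of one chain's saved draws, ignoring pickling overhead."""
    kept_draws = n_draws // thin  # thinning keeps every thin-th draw
    return kept_draws * n_params * BYTES_PER_VALUE


def fits_under_limit(n_draws, n_params, thin=1):
    """True if the estimated payload stays within the 32-bit length limit."""
    return estimated_payload_bytes(n_draws, n_params, thin) <= INT32_MAX


# Figures from the original post, without thinning.
print(estimated_payload_bytes(2_000_000, 500))
print(fits_under_limit(2_000_000, 500))

# Same run with a hypothetical thinning factor of 10.
print(estimated_payload_bytes(2_000_000, 500, thin=10))
print(fits_under_limit(2_000_000, 500, thin=10))
```

Under these assumptions the unthinned draws come to roughly 8 GB per chain, well over the limit, which would be consistent with the error appearing only once the results are collected.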