PyStan: RuntimeError: Initialization Failed

I’m trying out some basic examples in PyStan (2.17.1.0, Python 3.6.3, Linux64) and after some initial successes I’m having strange crashing issues.

I started out working through this tutorial (http://mc-stan.org/users/documentation/case-studies/pool-binary-trials.html), but in Python in a Jupyter notebook, and I was able to work through everything without any problems.

I then tried to fit my own data, which follows a similar format, and for simplicity’s sake dropped the terms needed for the posterior predictive. I get the following error:

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/pmccarthy/.pyenv/versions/3.6.3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/pmccarthy/.pyenv/versions/3.6.3/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_3663626409437619881.pyx", line 368, in stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_3663626409437619881._call_sampler_star
  File "stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_3663626409437619881.pyx", line 401, in stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_3663626409437619881._call_sampler
RuntimeError: Initialization failed.
"""

Following a hint I found here, I tried setting n_jobs=1 to avoid any problems with multiprocessing, but that only gives me this error instead:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-54-0447bc94b31b> in <module>()
      1 M = 10000
----> 2 complete_pooled_fit = pystan.stan(model_code = complete_pooling, data=data, iter=int(M/2),chains=4,n_jobs=1)

~/.pyenv/versions/demo/lib/python3.6/site-packages/pystan/api.py in stan(file, model_name, model_code, fit, data, pars, chains, iter, warmup, thin, init, seed, algorithm, control, sample_file, diagnostic_file, verbose, boost_lib, eigen_lib, n_jobs, **kwargs)
    400                      sample_file=sample_file, diagnostic_file=diagnostic_file,
    401                      verbose=verbose, algorithm=algorithm, control=control,
--> 402                      n_jobs=n_jobs, **kwargs)
    403     return fit

~/.pyenv/versions/demo/lib/python3.6/site-packages/pystan/model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    724         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    725         call_sampler_star = self.module._call_sampler_star
--> 726         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    727         samples = [smpl for _, smpl in ret_and_samples]
    728 

~/.pyenv/versions/demo/lib/python3.6/site-packages/pystan/model.py in _map_parallel(function, args, n_jobs)
     84             pool.join()
     85     else:
---> 86         map_result = list(map(function, args))
     87     return map_result
     88 

stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_9140702845920005778.pyx in stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_9140702845920005778._call_sampler_star()

stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_9140702845920005778.pyx in stanfit4anon_model_ac3fc03835e7b38559db8b2d950bb1f6_9140702845920005778._call_sampler()

RuntimeError: Initialization failed.

My code:

complete_pooling = """
data {
    int<lower=0> N;
    int<lower=1000> K[N]; // initial trials
    int<lower=10000> y[N]; // initial successes
    
  //  int<lower=0> K_new[N]; //new trials
  //  int<lower=0> y_new[N]; //new successes
}

parameters {
    real<lower=0, upper=1> phi; //chance of success (pooled), uniform by default
}

model {
    y ~ binomial(K,phi);
}
"""

data = {'N': N,
        'K': K,
        'y': y  # ,
        # 'K_new': K_new,
        # 'y_new': y_new
        }

M = 10000
complete_pooled_fit = pystan.stan(model_code = complete_pooling, data=data, iter=int(M/2),chains=4)

My data:

> data
{'K': 62       6579
 206      7187
 350      7748
 494      8321
 638      8738
 782      9111
 926      9685
 1070    10171
 1214    10695
 1358    11115
 1502    11565
 Name: male, dtype: int64, 'N': 11, 'y': 62      12141
 206     13223
 350     14239
 494     15286
 638     16012
 782     16770
 926     17824
 1070    18753
 1214    19699
 1358    20485
 1502    21309
 Name: total, dtype: int64}

Is your data a pandas.Series?

Use numpy arrays (e.g. np.array(ser) or ser.values)
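As a rough sketch of that conversion (the helper name as_int_array is made up for illustration and is not part of PyStan), something like this turns a list, NumPy array, or pandas Series into a plain 1-D integer array before it goes into the data dict:

```python
import numpy as np


def as_int_array(values):
    """Convert a list, NumPy array, or pandas Series to a 1-D int64 NumPy array.

    Hypothetical helper for illustration only; not a PyStan function.
    """
    arr = np.asarray(values)  # for a pandas Series this yields its values
    if arr.ndim != 1:
        raise ValueError("expected a 1-D sequence")
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError("expected integer data, got dtype %s" % arr.dtype)
    return arr.astype(np.int64)


# Example with a plain list; a Series would be passed the same way.
K_arr = as_int_array([6579, 7187, 7748])
```

The dtype check is there because Stan's int arrays need integer input, so a float column would be caught here rather than inside Stan.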

What compiler do you use?

It is a Series. That worked in the tutorial, but I also tried NumPy arrays and even lists, with no change.

I’m reasonably sure the compiler is gcc but I’ll recheck.

Hmm, no wait, I reread the error msg: this is not a PyStan problem, it’s a Stan problem.

RuntimeError: Initialization failed.

Your model cannot find initial values. See manual chapter 8.2.

http://mc-stan.org/users/documentation/index.html

After some sleep I figured it out. In the comments of my program I define

int<lower=1000> K[N]; // initial trials
int<lower=10000> y[N]; // initial successes

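but when I declared my data, I swapped K and y, so y > K. A binomial can't have more successes than trials, so Stan could not evaluate the model at any initial value.
Simple user error :)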
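
For anyone hitting the same thing, here is a minimal sketch of a sanity check to run before calling Stan. The function name check_binomial_data is made up for illustration. The numbers are the ones printed above, where the Series passed as K is named "male" and the one passed as y is named "total"; I am assuming "male" is the success count and "total" the trial count.

```python
def check_binomial_data(K, y):
    """Return the positions i where successes exceed trials (y[i] > K[i]).

    Hypothetical helper for illustration; works on lists, NumPy arrays,
    or pandas Series, since it only iterates over the values.
    """
    K = [int(k) for k in K]
    y = [int(v) for v in y]
    if len(K) != len(y):
        raise ValueError("K and y must have the same length")
    return [i for i, (k, v) in enumerate(zip(K, y)) if v > k]


# Values copied from the data printed in the question.
male = [6579, 7187, 7748, 8321, 8738, 9111, 9685, 10171, 10695, 11115, 11565]
total = [12141, 13223, 14239, 15286, 16012, 16770, 17824, 18753, 19699, 20485, 21309]

# As posted: K=male, y=total, so every row has y > K.
bad_rows = check_binomial_data(K=male, y=total)

# Swapped (assumed intent): trials=total, successes=male.
good_rows = check_binomial_data(K=total, y=male)

data = {'N': len(total), 'K': total, 'y': male}
```

Note that with the swap, the bounds in the data block would presumably need adjusting as well: int<lower=10000> y[N] would reject values such as 6579 when Stan reads the data.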
