Initialization Error when converting from RStan to PyStan

Hello,

I’ve been following a Stan tutorial designed to be used for RStan (An introduction to Stan with R | R-bloggers), and I’m trying to adapt it for use in PyStan. I have copy+pasted the Stan code, and created an equivalent data dictionary in Python as shown below:

################# Stan Model ############################
model = """
functions {
  real[] dz_dt(real t, real[] z, real[] theta,
             real[] x_r, int[] x_i) {
      real u = z[1];
      real v = z[2];

      real alpha = theta[1];
      real beta = theta[2];
      real gamma = theta[3];
      real delta = theta[4];

      real du_dt = (alpha - beta * v) * u;
      real dv_dt = (-gamma + delta * u) * v;
      return { du_dt, dv_dt };
    }
}

data {
  int N;           // num measurements
  real ts[N];      // measurement times > 0
  real y_init[2];  // initial measured population
  real y[N, 2];    // measured population at measurement times
}

parameters {
  real theta[4];   // theta = {alpha, beta, gamma, delta}
  real z_init[2];  // initial population
  real sigma[2];   // error scale
}

transformed parameters {
  real z[N, 2]
    = integrate_ode_rk45(dz_dt, z_init, 0, ts, theta,
                         rep_array(0.0, 0), rep_array(0, 0),
                         1e-6, 1e-5, 1e3);
}

model {
  //Priors
  theta[{1, 3}] ~ normal(1, 0.5);
  theta[{2, 4}] ~ normal(0.05, 0.05);
  sigma ~ lognormal(-1, 1);
  z_init ~ lognormal(log(10), 1);
  
  //Likelihood
  for (k in 1:2) {
    y_init[k] ~ lognormal(log(z_init[k]), sigma[k]);
    y[ , k] ~ lognormal(log(z[, k]), sigma[k]);
  }
}
"""
######### Data  ############################
Year = np.arange(1900,1921,1)
Lynx = np.array([4.0, 6.1, 9.8, 35.2, 59.4, 41.7, 19.0, 13.0, 8.3, 9.1, 7.4,
                8.0, 12.3, 19.5, 45.7, 51.1, 29.7, 15.8, 9.7, 10.1, 8.6])
Hare = np.array([30.0, 47.2, 70.2, 77.4, 36.3, 20.6, 18.1, 21.4, 22.0, 25.4,
                 27.1, 40.3, 57.0, 76.6, 52.3, 19.5, 11.2, 7.6, 14.6, 16.2, 24.7])
N = len(Year) - 1
ts = np.arange(0,N)

y = np.array([Lynx[1:N+1], Hare[1:N+1]])
y_tr = y.transpose()
y_init = np.array([4.0, 6.1])

stan_data = {'N':N, 'ts':ts, 'y_init': y_init, 'y':y_tr}

############# Run Model ########################
sm = pystan.StanModel(model_code = model)
fit = sm.sampling(data = stan_data, chains = 4, iter = 1000)

But when I try to sample it, I get the following error.


RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
“”"
Traceback (most recent call last):
File “C:\Users\dms228\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py”, line 121, in worker
result = (True, func(*args, **kwds))
File “C:\Users\dms228\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py”, line 44, in mapstar
return list(map(*args))
File “AppData\Local\Temp\pystan_kknlzgn1\stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012.pyx”, line 373, in stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012._call_sampler_star
File “AppData\Local\Temp\pystan_kknlzgn1\stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012.pyx”, line 406, in stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012._call_sampler
RuntimeError: Initialization failed.
“”"

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in
1 start = time.time()
----> 2 fit = sm.sampling(data = stan_data, chains = 4, iter = 1000)
3 end = time.time()
4 print("Time taken to run MCMC: ", round(end - start, 1), “secs”)

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
953 call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
954 call_sampler_star = self.module._call_sampler_star
→ 955 ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
956 samples = [smpl for _, smpl in ret_and_samples]
957

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
144 try:
145 pool = multiprocessing.Pool(processes=n_jobs)
→ 146 map_result = pool.map(function, args)
147 finally:
148 pool.close()

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
266 in a list that is returned.
267 ‘’’
→ 268 return self._map_async(func, iterable, mapstar, chunksize).get()
269
270 def starmap(self, func, iterable, chunksize=None):

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py in get(self, timeout)
655 return self._value
656 else:
→ 657 raise self._value
658
659 def _set(self, i, obj):

RuntimeError: Initialization failed.

Any idea where I’m going wrong?

I’m not sure what’s wrong, but I noticed several missing error messages that may be more probative:

Rejecting initial value:
Rejecting initial value:
  Error evaluating the log probability at the initial value.
  Error evaluating the log probability at the initial value.
Exception: integrate_ode_rk45: initial time is 0, but must be less than 0  (in 'unknown file name' at line 33)
Exception: integrate_ode_rk45: initial time is 0, but must be less than 0  (in 'unknown file name' at line 33)
--- snip ---
Initialization between (-2, 2) failed after 100 attempts. 
 Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
1 Like

If you follow the link to the original Stan Case Study you’ll see the R-Bloggers tutorial copied the code incorrectly. The parameters block should have constraints.

parameters {
  real<lower = 0> theta[4];   // theta = { alpha, beta, gamma, delta }
  real<lower = 0> z_init[2];  // initial population
  real<lower = 0> sigma[2];   // error scale
}

This mistake causes lots of warning but usually does not prevent initialization. The problem you’re seeing is explained by the error @jjramsey found. The integrator does not allow the initial time (zero in this case) to be part of the output times ts. Your Python code should have

ts = np.arange(1,N+1)

which gives the same range as ts = 1:N in R.

1 Like

Ah, makes sense. Thanks so much! I also had to add in a few more lower limits where they were missing, but the code works now :)