Initialization Error when converting from RStan to PyStan

Hello,

I’ve been following a Stan tutorial designed to be used for RStan (An introduction to Stan with R | R-bloggers), and I’m trying to adapt it for use in PyStan. I have copy+pasted the Stan code, and created an equivalent data dictionary in Python as shown below:

################# Stan Model ############################
model = """
functions {
  real[] dz_dt(real t, real[] z, real[] theta,
             real[] x_r, int[] x_i) {
      real u = z[1];
      real v = z[2];

      real alpha = theta[1];
      real beta = theta[2];
      real gamma = theta[3];
      real delta = theta[4];

      real du_dt = (alpha - beta * v) * u;
      real dv_dt = (-gamma + delta * u) * v;
      return { du_dt, dv_dt };
    }
}

data {
  int N;           // num measurements
  real ts[N];      // measurement times > 0
  real y_init[2];  // initial measured population
  real y[N, 2];    // measured population at measurement times
}

parameters {
  real theta[4];   // theta = {alpha, beta, gamma, delta}
  real z_init[2];  // initial population
  real sigma[2];   // error scale
}

transformed parameters {
  real z[N, 2]
    = integrate_ode_rk45(dz_dt, z_init, 0, ts, theta,
                         rep_array(0.0, 0), rep_array(0, 0),
                         1e-6, 1e-5, 1e3);
}

model {
  //Priors
  theta[{1, 3}] ~ normal(1, 0.5);
  theta[{2, 4}] ~ normal(0.05, 0.05);
  sigma ~ lognormal(-1, 1);
  z_init ~ lognormal(log(10), 1);
  
  //Likelihood
  for (k in 1:2) {
    y_init[k] ~ lognormal(log(z_init[k]), sigma[k]);
    y[ , k] ~ lognormal(log(z[, k]), sigma[k]);
  }
}
"""
######### Data  ############################
Year = np.arange(1900,1921,1)
Lynx = np.array([4.0, 6.1, 9.8, 35.2, 59.4, 41.7, 19.0, 13.0, 8.3, 9.1, 7.4,
                8.0, 12.3, 19.5, 45.7, 51.1, 29.7, 15.8, 9.7, 10.1, 8.6])
Hare = np.array([30.0, 47.2, 70.2, 77.4, 36.3, 20.6, 18.1, 21.4, 22.0, 25.4,
                 27.1, 40.3, 57.0, 76.6, 52.3, 19.5, 11.2, 7.6, 14.6, 16.2, 24.7])
N = len(Year) - 1
ts = np.arange(0,N)

y = np.array([Lynx[1:N+1], Hare[1:N+1]])
y_tr = y.transpose()
y_init = np.array([4.0, 6.1])

stan_data = {'N':N, 'ts':ts, 'y_init': y_init, 'y':y_tr}

############# Run Model ########################
sm = pystan.StanModel(model_code = model)
fit = sm.sampling(data = stan_data, chains = 4, iter = 1000)

But when I try to sample it, I get the following error.


RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
“”"
Traceback (most recent call last):
File “C:\Users\dms228\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py”, line 121, in worker
result = (True, func(*args, **kwds))
File “C:\Users\dms228\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py”, line 44, in mapstar
return list(map(*args))
File “AppData\Local\Temp\pystan_kknlzgn1\stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012.pyx”, line 373, in stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012._call_sampler_star
File “AppData\Local\Temp\pystan_kknlzgn1\stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012.pyx”, line 406, in stanfit4anon_model_9b92107ae938d605eb9d6b8d3a92217e_768342438768418012._call_sampler
RuntimeError: Initialization failed.
“”"

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in
1 start = time.time()
----> 2 fit = sm.sampling(data = stan_data, chains = 4, iter = 1000)
3 end = time.time()
4 print("Time taken to run MCMC: ", round(end - start, 1), “secs”)

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
953 call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
954 call_sampler_star = self.module._call_sampler_star
→ 955 ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
956 samples = [smpl for _, smpl in ret_and_samples]
957

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
144 try:
145 pool = multiprocessing.Pool(processes=n_jobs)
→ 146 map_result = pool.map(function, args)
147 finally:
148 pool.close()

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
266 in a list that is returned.
267 ‘’’
→ 268 return self._map_async(func, iterable, mapstar, chunksize).get()
269
270 def starmap(self, func, iterable, chunksize=None):

~\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\multiprocessing\pool.py in get(self, timeout)
655 return self._value
656 else:
→ 657 raise self._value
658
659 def _set(self, i, obj):

RuntimeError: Initialization failed.

Any idea where I’m going wrong?

I’m not sure what’s wrong, but I noticed several missing error messages that may be more probative:

Rejecting initial value:
Rejecting initial value:
  Error evaluating the log probability at the initial value.
  Error evaluating the log probability at the initial value.
Exception: integrate_ode_rk45: initial time is 0, but must be less than 0  (in 'unknown file name' at line 33)
Exception: integrate_ode_rk45: initial time is 0, but must be less than 0  (in 'unknown file name' at line 33)
--- snip ---
Initialization between (-2, 2) failed after 100 attempts. 
 Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

If you follow the link to the original Stan Case Study you’ll see the R-Bloggers tutorial copied the code incorrectly. The parameters block should have constraints.

parameters {
  real<lower = 0> theta[4];   // theta = { alpha, beta, gamma, delta }
  real<lower = 0> z_init[2];  // initial population
  real<lower = 0> sigma[2];   // error scale
}

This mistake causes lots of warning but usually does not prevent initialization. The problem you’re seeing is explained by the error @jjramsey found. The integrator does not allow the initial time (zero in this case) to be part of the output times ts. Your Python code should have

ts = np.arange(1,N+1)

which gives the same range as ts = 1:N in R.

Ah, makes sense. Thanks so much! I also had to add in a few more lower limits where they were missing, but the code works now :)