I'm new to PyStan (and to Stan in general), and I am trying to convert my PyMC model to PyStan to check whether PyStan is faster or more robust for Bayesian time series analysis, especially on large data. However, as soon as I start MCMC sampling, I run into RuntimeError: Initialization failed., which I suppose has more to do with Stan than with PyStan. I am going wrong somewhere, but I'm not sure where or why.
Stan model:
data {
  int I;          // Num. students
  int K;          // Num. skills
  int max_T;      // largest number in T
  int T[I];       // #opportunities for each of the I students
  int MAXSKILLS;
  int idxY[I,max_T,MAXSKILLS];
  int y[I,max_T,MAXSKILLS];   // output
}
parameters {
  // Priors
  vector[I] theta;
  vector[K] lambda0;
  vector[K] lambda1;
  vector[K] learn;
  vector[K] g;
  vector[K] ss;
}
transformed parameters {}
model {
  int alpha[I,max_T,K];
  real value;
  int idx;
  real py[I, max_T, K];
  theta ~ normal(0, 1);
  lambda0 ~ uniform(0.0, 2.5);
  lambda1 ~ normal(0, 1);
  learn ~ beta(1, 1);
  ones ~ bernoulli(1.0);
  ss ~ uniform(0.5, 1.0);
  g ~ uniform(0.0, 0.5);
  for (i in 1:I){
    // t = 1
    for (k in 1:K){
      value = inv_logit(1.7 * lambda1[k] * (theta[i] - lambda0[k]));
      alpha[i, 1, k] ~ bernoulli(value);
    }
    for (s in 1:MAXSKILLS){
      idx = idxY[i,1,s];
      if (idx <= K){
        py[i, 1, idx] = pow(ss[idx], alpha[i,1,idx]) * pow(g[idx], (1-alpha[i,1,idx]));
      }
    }
    // t = 2, 3 ..... T[i]
    for (t in 2:T[i]){
      for (k in 1:K){
        value = alpha[i, t-1, k];
        if (value == 1){
          alpha[i, t, k] ~ bernoulli(1);
        }
        else {
          alpha[i, t, k] ~ bernoulli(learn[k]);
        }
      }
      for (s in 1:MAXSKILLS){
        idx = idxY[i,t,s];
        if (idx <= K){
          py[i, t, idx] = pow(ss[idx], alpha[i,t,idx]) * pow(g[idx], (1-alpha[i,t,idx]));
        }
      }
    }
    for (t in 1:T[i]){
      for (s in 1:MAXSKILLS){
        idx = idxY[i,t,s];
        if (idx <= K){
          y[i,t,idx] ~ bernoulli(py[i, t, idx]);
        }
      }
    }
  }
}
generated quantities {}
The model compiles fine but when I run the block of code below, I get the error.
hotDINA_fit = hotDINA.sampling(data={'I': 3,
                                     'K': 22,
                                     'max_T': max_T,
                                     'T': T,
                                     'MAXSKILLS': 4,
                                     'idxY': idxY,
                                     'y': obsY
                                     },
                               iter=1000,
                               chains=4,
                               warmup=500,
                               n_jobs=1,
                               seed=42)
T is an np.ndarray with shape (I,).
idxY and obsY are both integer np.ndarrays with shape (I, max_T, 4).
Am I making a mistake in how I feed the data to the program?
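To rule out the data as the culprit, the inputs can be checked against the shapes declared in the data block before calling sampling. The following is only a sketch: check_data, nested_shape and flatten are hypothetical helper names (not part of PyStan), and they work on plain nested lists, so NumPy arrays would first be converted with .tolist(). The check on idxY assumes the indices are meant to be 1-based, as Stan arrays are.

```python
def nested_shape(a):
    """Shape of a nested list, looking only at the first element of each level."""
    shape = []
    while isinstance(a, (list, tuple)):
        shape.append(len(a))
        if len(a) == 0:
            break
        a = a[0]
    return tuple(shape)


def flatten(a):
    """Yield every scalar stored in a nested list."""
    if isinstance(a, (list, tuple)):
        for item in a:
            yield from flatten(item)
    else:
        yield a


def check_data(data):
    """Return a list of problems found in the dict passed as data=..."""
    problems = []
    I, max_T, S = data["I"], data["max_T"], data["MAXSKILLS"]

    # T is declared as int T[I] and is used as a loop bound over 1..max_T.
    if nested_shape(data["T"]) != (I,):
        problems.append("T must have shape (I,)")
    elif any(t < 1 or t > max_T for t in data["T"]):
        problems.append("every T[i] must lie in 1..max_T")

    # idxY and y are both declared with dimensions [I, max_T, MAXSKILLS].
    for name in ("idxY", "y"):
        if nested_shape(data[name]) != (I, max_T, S):
            problems.append(name + " must have shape (I, max_T, MAXSKILLS)")

    # Stan indexes from 1, so a skill index of 0 (or below) is out of range.
    if any(v < 1 for v in flatten(data["idxY"])):
        problems.append("idxY contains an index below 1")
    return problems
```

With the arrays from the call above, this would be used as check_data({'I': 3, 'K': 22, 'max_T': max_T, 'T': T.tolist(), 'MAXSKILLS': 4, 'idxY': idxY.tolist(), 'y': obsY.tolist()}). An empty list means that none of these particular checks failed. It does not by itself show that the data are correct.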
Traceback:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-29-581607bd46ce> in <module>
12 warmup=500,
13 n_jobs=1,
---> 14 seed=42)
D:\anaconda3\envs\stanenv\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
776 call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
777 call_sampler_star = self.module._call_sampler_star
--> 778 ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
779 samples = [smpl for _, smpl in ret_and_samples]
780
D:\anaconda3\envs\stanenv\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
88 pool.join()
89 else:
---> 90 map_result = list(map(function, args))
91 return map_result
92
stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236.pyx in stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236._call_sampler_star()
stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236.pyx in stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236._call_sampler()
RuntimeError: Initialization failed.