RuntimeError: Initialization failed

I’m new to PyStan (and Stan in general) and was trying to convert my PyMC model into PyStan, to check whether PyStan is any faster or more robust for Bayesian time series analysis, especially on large data. However, as soon as I started MCMC sampling I ran into RuntimeError: Initialization failed., which I suppose has more to do with Stan than with PyStan? I am clearly going wrong somewhere, but I’m not sure where or why this happens.


Stan model:

data {
    int I;       // Num. students
    int K;       // Num. skills
    int max_T;   // largest number in T
    int T[I]; // #opportunities for each of the I students
    int MAXSKILLS;
    int idxY[I,max_T,MAXSKILLS];
    int y[I,max_T,MAXSKILLS];   // output
}
parameters {
    // Priors
    vector[I] theta;
    vector[K] lambda0;
    vector[K] lambda1;
    vector[K] learn;
    vector[K] g;
    vector[K] ss;
}
transformed parameters {}
model {
    int alpha[I,max_T,K];
    real value;
    int idx;
    real py[I, max_T, K];
    
    theta ~ normal(0, 1);
    lambda0 ~ uniform(0.0, 2.5);
    lambda1 ~ normal(0, 1);
    learn ~ beta(1, 1);
    ones ~ bernoulli(1.0);
    ss ~ uniform(0.5, 1.0);
    g ~ uniform(0.0, 0.5);
    
    for (i in 1:I){
        // t = 1
        for (k in 1:K){
            value = inv_logit(1.7 * lambda1[k] * (theta[i] - lambda0[k]));
            alpha[i, 1, k] ~ bernoulli(value);
        }
        
        for (s in 1:MAXSKILLS){
            idx = idxY[i,1,s];
            if (idx <= K){
                py[i, 1, idx] = pow(ss[idx], alpha[i,1,idx]) * pow(g[idx], (1-alpha[i,1,idx]));
            }
        }
        
        // t = 2, 3 ..... T[i]
        for (t in 2:T[i]){
            for (k in 1:K){
                value = alpha[i, t-1, k];
                if (value == 1){
                    alpha[i, t, k] ~ bernoulli(1);
                }
                else {
                    alpha[i, t, k] ~ bernoulli(learn[k]);
                }
            }
            for (s in 1:MAXSKILLS){
                idx = idxY[i,t,s];
                if (idx <= K){
                    py[i, t, idx] = pow(ss[idx], alpha[i,t,idx]) * pow(g[idx], (1-alpha[i,t,idx]));
                }
            }
        }
        
        for (t in 1:T[i]){
            for (s in 1:MAXSKILLS){
                idx = idxY[i,t,s];
                if (idx <= K){
                    y[i,t,idx] ~ bernoulli(py[i, t, idx]);
                }
            }
        }
        
    }
}
generated quantities {}
 

The model compiles fine but when I run the block of code below, I get the error.

hotDINA_fit = hotDINA.sampling(data={'I': 3,
                                     'K': 22,
                                     'max_T': max_T,
                                     'T': T,
                                     'MAXSKILLS': 4,
                                     'idxY': idxY,
                                     'y': obsY
                                     },
                               iter=1000, 
                               chains=4, 
                               warmup=500, 
                               n_jobs=1, 
                               seed=42)

T is an int np.ndarray with shape (I,); idxY and obsY are both int np.ndarrays with shape (I, max_T, 4).
Am I going wrong in how I am feeding the data to the program?


Traceback:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-581607bd46ce> in <module>
     12                                warmup=500,
     13                                n_jobs=1,
---> 14                                seed=42)

D:\anaconda3\envs\stanenv\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    776         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    777         call_sampler_star = self.module._call_sampler_star
--> 778         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    779         samples = [smpl for _, smpl in ret_and_samples]
    780 

D:\anaconda3\envs\stanenv\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
     88             pool.join()
     89     else:
---> 90         map_result = list(map(function, args))
     91     return map_result
     92 

stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236.pyx in stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236._call_sampler_star()

stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236.pyx in stanfit4hotDINA_3c07a11e35e886cc4ebd1d6922805c31_4995909647546638236._call_sampler()

RuntimeError: Initialization failed.

Indeed, you should constrain the support of the parameters to match the priors you are using; otherwise the initialiser will propose initial values in invalid parts of the parameter space:

parameters {
    // Priors
    vector[I] theta;
    vector<lower = 0, upper = 2.5>[K] lambda0;
    vector[K] lambda1;
    vector<lower = 0, upper = 1>[K] learn;
    vector<lower = 0, upper = 0.5>[K] g;
    vector<lower = 0.5, upper = 1>[K] ss;
}

I have no idea whether the Python types match the declared Stan data types – I think you’d get a different error if they were incompatible.
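
If you want to rule that out, a quick check along these lines might help. This is only an untested sketch; it assumes the array names T, idxY, obsY and the scalar max_T from your post are in scope:

import numpy as np

# Hypothetical sanity checks on the inputs before calling hotDINA.sampling()
# (names taken from the post above).
I, MAXSKILLS = 3, 4

# Stan int arrays need integer dtypes and exactly the shapes declared in data {}.
assert T.shape == (I,) and np.issubdtype(T.dtype, np.integer)
assert idxY.shape == (I, max_T, MAXSKILLS) and np.issubdtype(idxY.dtype, np.integer)
assert obsY.shape == (I, max_T, MAXSKILLS) and np.issubdtype(obsY.dtype, np.integer)

# idxY is used as a (1-based) index into arrays of size K, and the y values the
# model actually touches must be 0/1 for bernoulli; printing the ranges is a
# quick way to eyeball this.
print(T, idxY.min(), idxY.max(), np.unique(obsY))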

Hey, I tried this but I get the exact same error again.

Some other things to try:

  • Set init_r in your call to sampling to a very small value like 1e-5 or 0, since the values supplied to pow and inv_logit might be too extreme (see the sketch after this list).
  • Replace everything in the model block with target += std_normal_lpdf(theta); to check that the model runs and that the data are supplied in the correct format.
  • The runtime error suggests you are working on Windows. Try running a simple example that is known to work, to make sure your computing environment / Stan setup is functioning.
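
For the first point, the call might look roughly like this. It is only a sketch: I’m assuming PyStan 2 forwards an init_r keyword to the sampler (as RStan does); if it doesn’t, init=0 is a documented alternative, and stan_data here is just a placeholder for the data dict from your earlier call:

# Sketch: init_r is assumed to be passed through to the sampler; init=0
# (start every unconstrained parameter at 0) is a documented fallback.
hotDINA_fit = hotDINA.sampling(data=stan_data,   # placeholder for your data dict
                               iter=1000,
                               chains=4,
                               warmup=500,
                               n_jobs=1,
                               seed=42,
                               init_r=1e-5)      # or: init=0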

I tried all the things you mentioned in your previous comment, but this doesn’t solve the error either. Also, I’m sure that the Stan setup is functioning properly, since I tried out simpler examples and they give the plots, summary stats, results, etc. However, I noticed something which might be causing this issue.
For example, this model gives me the same error:

model {
    
    int alpha[I, K];
    int value;
    
    theta ~ normal(0, 1);
    lambda0 ~ uniform(0.0, 2.5);
    lambda1 ~ normal(0, 1);
    learn ~ beta(1, 1);
    ss ~ uniform(0.5, 1.0);
    g ~ uniform(0.0, 0.5);
    
    for (i in 1:I){
        for (t in 1:T[i]){
            for (k in 1:K){
                value ~ bernoulli(inv_logit(1.7 * lambda1[k] * (theta[i] - lambda0[k])));
                y[i, t, k] ~ bernoulli(value);
            }
        }
    }
}

In the above snippet, when I write y[i, t, k] ~ bernoulli(value), I’m effectively doing int ~ bernoulli(int) instead of int ~ bernoulli(real), since value is an int while bernoulli expects a real probability. However, the model compiles and
hotDINA = pystan.model.StanModel(model_code=stan_model, model_name="hotDINA") does not give an error (shouldn’t the compiler see this as a problem and throw an error, instead of the sampling call failing with “Initialization failed”?).
But the sampling line gives me the initialization error.
Is there a way to cast value to a real, so I can write something like y[i,t,k] ~ bernoulli(real(value)) instead?
If there is no direct way to typecast, how do I overcome this?

Oh, the model has a latent discrete parameter – I missed it the first time around. You will need to marginalise/integrate value out of the model. I would consult chapter 7 of the user guide: https://mc-stan.org/docs/2_23/stan-users-guide/latent-discrete-chapter.html, and probably chapter 9 as well (for more examples of marginalisation).
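
The basic idea, written out in plain Python/NumPy rather than Stan (a made-up two-state example with arbitrary numbers, not your model): instead of sampling the 0/1 latent variable, you sum its two possible values out of the likelihood, which leaves a smooth function of the continuous parameters.

import numpy as np

def bernoulli_pmf(y, p):
    # P(Y = y) for Y ~ Bernoulli(p), with y in {0, 1}
    return p**y * (1 - p)**(1 - y)

# Hypothetical numbers, only to show the structure of the marginalisation:
# alpha ~ Bernoulli(value) is the latent 0/1 mastery indicator,
# y | alpha = 1 ~ Bernoulli(ss), y | alpha = 0 ~ Bernoulli(g).
value, ss, g, y = 0.7, 0.9, 0.2, 1

# Instead of drawing alpha, sum over its two possible values:
# P(y) = P(alpha = 1) * P(y | alpha = 1) + P(alpha = 0) * P(y | alpha = 0)
p_y = value * bernoulli_pmf(y, ss) + (1 - value) * bernoulli_pmf(y, g)

# In Stan you would add log(p_y) to target (log_mix / log_sum_exp help with
# numerical stability), so no integer parameter is ever sampled.
print(np.log(p_y))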

Thanks, this is really helpful. I understand the marginalizing part, but how do I identify which ones are “latent discrete parameters” in my model (the very first code snippet I pasted in this thread)? I do not understand on what basis “s” was chosen as a latent discrete parameter in the link that you mentioned.

Is it any parameter in the model that is “hidden” and discrete (like an integer)?

Yes – anything that is an integer, is not data, and is given a distribution (i.e. appears on the left-hand side of a sampling statement). In your first example this would be alpha.


Thanks, that’s really helpful. I tried to marginalize alpha in my initial model mentioned in this thread.
I have posted how I marginalized alpha over here: bayesian - Marginalizing a time-series model in Stan - Cross Validated

Does my approach for marginalizing seem valid to you?

Here’s my implementation of the stan model for the same:

data {
    int I;                          // Num. students
    int K;                          // Num. skills
    int max_T;                      // largest number in T
    int T[I];                       // #opportunities for each of the I students
    int MAXSKILLS;
    int idxY[I,max_T,MAXSKILLS];
    int y[I,K,max_T];               // output
}
parameters {
    vector[I] theta;
    vector<lower = 0, upper = 2.5>[K] lambda0;
    vector[K] lambda1;
    vector<lower = 0, upper = 1>[K] learn;
    vector<lower = 0, upper = 0.5>[K] g;
    vector<lower = 0.5, upper = 1>[K] ss;
}
transformed parameters {
    real lp[I, K, max_T];
    real G[I,K,max_T];
    real S[I,K,max_T];
    real value[I,K];
    real expsum = 0.0;
    
    for (i in 1:I){
        for (k in 1:K){
            for (t in 1:T[i]) {
                lp[i,k,t] = 0.0;
            }
        }
    } 
    
    for (i in 1:I){
        for (k in 1:K){
            value[i,k] = inv_logit(1.7 * lambda1[k] * (theta[i] - lambda0[k]));
            G[i,k,1] = pow(g[k], y[i,k,1]) * pow(1-g[k], 1-y[i,k,1]);
            S[i,k,1] = pow(ss[k], y[i,k,1]) * pow(1-ss[k], 1-y[i,k,1]);
            
            for (t in 2:T[i]){
                G[i,k,t] = pow(g[k], y[i,k,t]) * pow(1-g[k], 1-y[i,k,t]);
                S[i,k,1] = pow(ss[k], y[i,k,t]) * pow(1-ss[k], 1-y[i,k,t]);
            }
        } 
    }
    
    for (i in 1:I){
        for (k in 1:K){
            lp[i,k,1] = G[i,k,1] + value[i,k] * (S[i,k,1] - G[i,k,1]);
            expsum += exp(lp[i,k,1]);
        }
    }
    
    for (i in 1:I){
        for (k in 1:K){
            for (t in 2:T[i]) {
                lp[i,k,t] = S[i,k,t] + ((G[i,k,t] - S[i,k,t]) * (1-value[i,k]) * pow(1-learn[k], t) );
                expsum += exp(lp[i,k,t]);
            }
        }
    }
    expsum = log(expsum);
}
model {
    
    theta ~ normal(0, 1);
    lambda0 ~ uniform(0.0, 2.5);
    lambda1 ~ normal(0, 1);
    learn ~ beta(1, 1);
    ss ~ uniform(0.5, 1.0);
    g ~ uniform(0.0, 0.5);
        
    target += expsum;
}
generated quantities {}
"""

The model compiles fine but sampling still gives me this error:

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "D:\anaconda3\envs\stanenv\lib\multiprocessing\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "D:\anaconda3\envs\stanenv\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "stanfit4hotDINA_0c6f4ec79e220dad7719069124ca4383_5136997580262565025.pyx", line 371, in stanfit4hotDINA_0c6f4ec79e220dad7719069124ca4383_5136997580262565025._call_sampler_star
  File "stanfit4hotDINA_0c6f4ec79e220dad7719069124ca4383_5136997580262565025.pyx", line 404, in stanfit4hotDINA_0c6f4ec79e220dad7719069124ca4383_5136997580262565025._call_sampler
RuntimeError: Initialization failed.
"""

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-12-7d2fdd7ef74f> in <module>
      9                                iter=1000,
     10                                chains=4,
---> 11                                warmup=500)

D:\anaconda3\envs\stanenv\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    776         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    777         call_sampler_star = self.module._call_sampler_star
--> 778         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    779         samples = [smpl for _, smpl in ret_and_samples]
    780 

D:\anaconda3\envs\stanenv\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
     83         try:
     84             pool = multiprocessing.Pool(processes=n_jobs)
---> 85             map_result = pool.map(function, args)
     86         finally:
     87             pool.close()

D:\anaconda3\envs\stanenv\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

D:\anaconda3\envs\stanenv\lib\multiprocessing\pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658 
    659     def _set(self, i, obj):

RuntimeError: Initialization failed.

Could you please tell me where I’m going wrong?

At first glance it looks okay, but I’d need to spend a lot more time doing the maths to be sure (marginalising / integrating out parameters is a bit tricky).

Some other things I would try:

  • I’m concerned that value often ends up exactly 0 or 1 because inv_logit underflows/overflows in double precision (see the snippet after this list).
  • I think you should be getting a more informative error message from PyStan about which parameter failed to initialise, so I’m going to tag @ahartikainen – I think RStan usually tells you which parameter is failing to initialise?
  • Can you share the data you are supplying to this model?
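
On the first point, a quick way to see the saturation, using SciPy’s expit as a stand-in for Stan’s inv_logit (this is just an illustration with made-up arguments, not your model):

from scipy.special import expit  # numerically stable inverse logit

# In double precision, inv_logit saturates: once the argument is large enough
# the result is exactly 1.0 (or indistinguishable from 0.0), so terms like
# pow(value, ...) in the model can hit hard 0s and 1s.
for x in [5.0, 10.0, 20.0, 40.0]:
    print(x, expit(x), expit(-x))
# expit(40.0) evaluates to exactly 1.0 here; expit(-40.0) is about 4e-18.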

Hi,

are you running your model in Jupyter or in a script/terminal?

If in Jupyter, check the C++ messages in the terminal window where the notebook server is running; if you run from a terminal, add verbose=True to the sampling call to see the C++ messages.
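
For example (verbose is one of sampling’s documented arguments, as in the traceback above; stan_data again just stands in for your data dict):

# Same call as before, with verbose=True so the C++ messages (including what
# is being rejected at initialisation) are printed to the console.
hotDINA_fit = hotDINA.sampling(data=stan_data,   # placeholder for your data dict
                               iter=1000,
                               chains=4,
                               warmup=500,
                               n_jobs=1,
                               seed=42,
                               verbose=True)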