Simple hierarchical model gets PARSER EXPECTED error

This is my first time programming in pystan and first-ever multilevel model. I have data of 10 subjects, in 2 conditions, and each subject performs 40 trials each in a given condition. I am trying to instantiate a multilevel model for condition level distribution parameter values as well as each individual parameters across two conditions.
In python I use following data dictionary:

sdata = {'N':len(y), 'J':nSubject,'id':subject,'K':nCond,'cid':cond,'y':y}

But in doing so I am stuck at:
PARSER EXPECTED: <one or more container indexes followed by ‘]’> error on y[i] ~ gamma(alpha[id[i],cid[i]],beta[id[i],cid[i]]) as well as I do not understand what is the correct way to follow!
below is my stan code:

data {
    int<lower = 0> N; //number of observations
    int<lower = 1> J; //number of subjects
    int<lower = 1> K; //number of conditions
    vector<lower = 1>[N] id; //subject id vector
    vector<lower = 1>[N] cid; //condition id vector
    vector[N] y; //response
}

parameters{
    vector<lower=0>[K] alpha[J]; 
    vector<lower=0>[K] beta[J]; 
        
    vector<lower=0>[K] mu; //hyper prior on alpha
    vector<lower=0>[K] sigma; //hyper prior on alpha
    
 
    vector<lower =0.1,upper=100>[K] lower_b; //hyper prior on beta 
    vector<lower =0.1,upper=100>[K] upper_b; //hyper prior on beta

    
}

model{
    
    //likelihood
    for (i in 1:N){
    y[i] ~ gamma(alpha[id[i],cid[i]],beta[id[i],cid[i]]);}
    
    //prior
    for (k in 1:K){
        alpha[k] ~ normal(mu,sigma);
        beta[k] ~ gamma(lower_b,upper_b);
        }
    //hyperprior
    mu ~ normal(7,1);
    sigma ~ uniform(.1,100);
    
    lower_b ~ uniform(.1,100);
    upper_b ~ uniform(.1,100);
    
}

I further tried to just model one condition hierarchically and still get the same error on y…
following is my stan code:

data{
    int<lower=0> N; //number of observations
    int<lower=1> J; // number of subject 
    real<lower=1>id[N]; // subject id vector
    real<lower=0>y[N]; // response
}

parameters{
    real alpha[J];
    real beta[J];
    
    real mu;
    real<lower=0.1,upper=100> sigma;
    real<lower=0.1,upper=100> alpha_b;
    real<lower=0.1,upper=100> beta_b;
    
}

model{
    for (i in 1:N)
        {y[i] ~ gamma(alpha[id[N]],beta[id[N]])}
    for (i in 1:J)
    {alpha[i] ~ normal(mu,sigma)
    beta[i] ~ gamma(alpha_b,beta_b)}
    mu ~ normal(6,5)
}

Hi sranj,

The relevant part of the error message is:

Container index must be integer; found type=real

This is because you’ve declared id and cid as vector, which implies that they hold real values. For indexing containers you need to pass int values. For your model, you just need to change their declarations to:

    int<lower = 1> id[N]; //subject id vector
    int<lower = 1> cid[N]; //condition id vector

Thanks @andrjohns, I tried what you said while just modeling the response of 1 condition, but now got stuck with a RuntimeError: Initialization failed.
The model code is as follows:

data{
    int<lower=0> N; //number of observations
    int<lower=1> J; // number of subject 
    int<lower=1> id[N]; // subject id vector
    real<lower=0> y[N]; // response
}

parameters{
    real alpha[J];
    real beta[J];
    
    real mu;
    real<lower=0.1,upper=100> sigma;
    real<lower=0.1,upper=100> alpha_b;
    real<lower=0.1,upper=100> beta_b;
    
}

model{
    for (i in 1:N)
        {y[i] ~ gamma(alpha[id[N]],beta[id[N]]);}
    for (i in 1:J)
    {alpha[i] ~ normal(mu,sigma);
    beta[i] ~ gamma(alpha_b,beta_b);}
    mu ~ normal(6,5);
}

Full traceback call:

RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\saurabh\Anaconda3\envs\stan\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\saurabh\Anaconda3\envs\stan\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_1126835550526207551.pyx", line 373, in stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_1126835550526207551._call_sampler_star
  File "stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_1126835550526207551.pyx", line 406, in stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_1126835550526207551._call_sampler
RuntimeError: Initialization failed.
"""

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-109-13ac769aceff> in <module>
      1 smc1 = ps.StanModel(model_code=mod)
----> 2 fit_c1 = smc1.sampling(data=c1data,iter=2000,warmup=1000,seed=101)

~\Anaconda3\envs\stan\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    811         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    812         call_sampler_star = self.module._call_sampler_star
--> 813         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    814         samples = [smpl for _, smpl in ret_and_samples]
    815 

~\Anaconda3\envs\stan\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
     83         try:
     84             pool = multiprocessing.Pool(processes=n_jobs)
---> 85             map_result = pool.map(function, args)
     86         finally:
     87             pool.close()

~\Anaconda3\envs\stan\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
    264         in a list that is returned.
    265         '''
--> 266         return self._map_async(func, iterable, mapstar, chunksize).get()
    267 
    268     def starmap(self, func, iterable, chunksize=None):

~\Anaconda3\envs\stan\lib\multiprocessing\pool.py in get(self, timeout)
    642             return self._value
    643         else:
--> 644             raise self._value
    645 
    646     def _set(self, i, obj):

RuntimeError: Initialization failed.

What about if you run with the option init=0?

I don’t know what it means but I tried that with following command:
fit_c1 = smc1.sampling(data=c1data,iter=2000,warmup=1000,seed=101,init=0)
and still got the runtime error as:

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\saurabh\Anaconda3\envs\stan\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\saurabh\Anaconda3\envs\stan\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_5213880800833134272.pyx", line 373, in stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_5213880800833134272._call_sampler_star
  File "stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_5213880800833134272.pyx", line 406, in stanfit4anon_model_50614590e3a38040376d7ac3f233b0af_5213880800833134272._call_sampler
RuntimeError: Initialization failed.
"""

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-110-e90a33300d88> in <module>
      1 smc1 = ps.StanModel(model_code=mod)
----> 2 fit_c1 = smc1.sampling(data=c1data,iter=2000,warmup=1000,seed=101,init=0)

~\Anaconda3\envs\stan\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    811         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    812         call_sampler_star = self.module._call_sampler_star
--> 813         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    814         samples = [smpl for _, smpl in ret_and_samples]
    815 

~\Anaconda3\envs\stan\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
     83         try:
     84             pool = multiprocessing.Pool(processes=n_jobs)
---> 85             map_result = pool.map(function, args)
     86         finally:
     87             pool.close()

~\Anaconda3\envs\stan\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
    264         in a list that is returned.
    265         '''
--> 266         return self._map_async(func, iterable, mapstar, chunksize).get()
    267 
    268     def starmap(self, func, iterable, chunksize=None):

~\Anaconda3\envs\stan\lib\multiprocessing\pool.py in get(self, timeout)
    642             return self._value
    643         else:
--> 644             raise self._value
    645 
    646     def _set(self, i, obj):

RuntimeError: Initialization failed.

Do you think my parameters and model are correct?

Oh just saw the problem, in your loop for y:

model{
    for (i in 1:N)
        {y[i] ~ gamma(alpha[id[N]],beta[id[N]]);}
}

By writing id[N] you’re just selecting the N-th value of id for every iteration of the loop, instead you should write:

model{
    for (i in 1:N)
        {y[i] ~ gamma(alpha[id[i]],beta[id[i]]);}
}

However, to simplify and speed-up your model, you should take advantage of Stan’s vectorisation rather than writing loops. So you could instead write your model block as:

model{
    mu ~ normal(6,5);
    alpha ~ normal(mu,sigma);
    beta ~ gamma(alpha_b,beta_b);}
    y ~ gamma(alpha[id],beta[id]);
}

thanks @andrjohns, i ran the model code as:

    mu ~ normal(6,5);
    alpha ~ normal(mu,sigma);
    beta ~ gamma(alpha_b,beta_b);
    for (i in 1:N)
        {y[i] ~ gamma(alpha[id[i]],beta[id[i]]);}
    

and also the optimized version that you suggested too…
I am still getting the same RuntimeError: Initialization failed.
I even increased the iter from 2000 to 5000 but that too didn’t helped.

There are two approaches you can take to debugging this. Firstly, try putting putting smaller priors on your parameters (i.e., give mu a normal(0,1) prior and put prior distributions on alpha_b and beta_b). It might be that your current priors are resulting in admissable values somewhere, so smaller priors can help diagnose this.

If that doesn’t work, try building up your model from a simple (non-hierarchical) model and iteratively adding components. This will help you identify which part of your model is causing the initialisation to fail

I did a non-hierarchical model as follows:

data{
    int<lower=0> N; //number of observations

    vector<lower=0>[N] y; // response
}

parameters{

    real <lower=0> alpha;
    real <lower=0> beta;
}

model{
     
    alpha ~ normal(3,0.5);
    beta ~ uniform(1,3);
    y ~ gamma(alpha,beta);


}

and got the following warnings:

INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_0cc2b9d3647b74002d72c834a9c9870a NOW.
WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed
WARNING:pystan:9230 of 12000 iterations ended with a divergence (76.9 %).
WARNING:pystan:Try running with adapt_delta larger than 0.8 to remove the divergences.
WARNING:pystan:Chain 1: E-BFMI = 0.0108
WARNING:pystan:Chain 4: E-BFMI = 2.44e-05
WARNING:pystan:E-BFMI below 0.2 indicates you may need to reparameterize your model

with R-hat of 1.6 and 1.15 for alpha and beta respectively.

This is more specific to your data and your model. It might be that a gamma distribution or these priors aren’t appropriate for your data. You can try different distributions or priors (as consistent with your theory).

For simple hierarchical models like these, it’s often faster and simpler to use either the rstanarm or brms packages to specify and test alternatives.