RuntimeError: Initialization failed

Hi All,

I'm new to Stan and am running a tutorial on a CLV model found here:

Running python 3.6 on a conda environment.

I cannot work out what's going wrong when I execute the model. I keep getting this error (bottom part only):

File "/Users/garforbe/anaconda3/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value

RuntimeError: Initialization failed.

I build the data object as below:

here’s the data we will provide to STAN :

data={'n_cust':len(rfm),
    'x':rfm['frequency'].values,
    'tx':rfm['recency'].values,
    'T':rfm['T'].values
}

I then compile and run the model through the caching helper, which is where it fails. The model is below.

iterations = 1000
warmup = 500

I recommend training for several 1000’s iterations. Here we run the STAN model :

pareto_nbd_fit = stan_cache(paretonbd_model, model_name='paretonbd_model',
                            data=data, chains=1, iter=iterations, warmup=warmup)

data{
int<lower=0> n_cust; //number of customers 
vector<lower=0>[n_cust] x; 
vector<lower=0>[n_cust] tx; 
vector<lower=0>[n_cust] T; 
}

parameters{
// vectors of lambda and mu for each customer. 
// Here I apply limits between 0 and 1 for each 
// parameter. A value of lambda or mu > 1.0 is unphysical 
// since you don't have enough time resolution to go less than 
// 1 time unit. 
vector <lower=0,upper=1.0>[n_cust] lambda; 
vector <lower=0,upper=1.0>[n_cust] mu;

// parameters of the prior distributions : r, alpha, s, beta. 
// for both lambda and mu
real <lower=0>r;
real <lower=0>alpha;
real <lower=0>s;
real <lower=0>beta;
}

model{

// temporary variables : 
vector[n_cust] like1; // likelihood
vector[n_cust] like2; // likelihood 

// Establishing hyperpriors on parameters r, alpha, s, and beta. 
r ~ normal(0.5,0.1);
alpha ~ normal(10,1);
s ~ normal(0.5,0.1);
beta ~ normal(10,1);

// Establishing the Prior Distributions for lambda and mu : 
lambda ~ gamma(r,alpha); 
mu ~ gamma(s,beta);

// The likelihood of the Pareto/NBD model : 
like1 = x .* log(lambda) + log(mu) - log(mu+lambda) - tx .* (mu+lambda);
like2 = (x + 1) .* log(lambda) - log(mu+lambda) - T .* (lambda+mu);

// Here we increment the log probability density (target) accordingly 
target+= log(exp(like1)+exp(like2));
}
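
One thing that may be worth checking in the last line of the model (an assumption, not a confirmed diagnosis): if `tx` or `T` are large, `like1` and `like2` become very negative, both `exp()` terms can underflow to zero, and in Stan `log(0)` is negative infinity, which is one way an initial value can be rejected. The pure-Python sketch below uses made-up values for a single customer to show the underflow and the usual log-sum-exp rewrite; the function names are illustrative only.

```python
import math

def naive_log_sum_exp(a, b):
    # Same form as log(exp(like1) + exp(like2)) in the model, for one customer.
    return math.log(math.exp(a) + math.exp(b))

def stable_log_sum_exp(a, b):
    # Factor out the larger term so at least one exp() evaluates to exactly 1.
    m = max(a, b)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Hypothetical per-customer values, chosen only to trigger underflow.
like1, like2 = -800.0, -805.0

try:
    naive = naive_log_sum_exp(like1, like2)
except ValueError:
    # Both exp() calls underflow to 0.0, so Python's math.log(0.0) raises.
    naive = None

stable = stable_log_sum_exp(like1, like2)
print("naive:", naive)
print("stable:", stable)
```

Stan has a built-in `log_sum_exp(a, b)` for two reals, so the same idea could be applied per customer inside a loop. Whether this is what triggers the error here depends on the scale of the data.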


I have a feeling it's something to do with the data setup, but I'm not entirely sure. Any help would be much appreciated!

Regards

I updated my data object as a test. It seems to run further, but I still get an error:

here’s the data we will provide to STAN :

data={'n_cust':len(rfm),
    'x':rfm['frequency'].values.tolist(),
    'tx':rfm['recency'].values.tolist(),
    'T':rfm['T'].values.tolist()
} 


  File "<ipython-input-39-30a0b1a65a2c>", line 1, in <module>
    pareto_nbd_fit = stan_cache(paretonbd_model, model_name='paretonbd_model',                                   data=data, chains=1, iter=iterations, warmup=warmup)

  File "<ipython-input-37-c77f7cab1c58>", line 16, in stan_cache
    return sm.sampling(**kwargs)

  File "/Users/garforbe/anaconda3/lib/python3.6/site-packages/pystan/model.py", line 776, in sampling
    ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)

  File "/Users/garforbe/anaconda3/lib/python3.6/site-packages/pystan/model.py", line 91, in _map_parallel
    map_result = list(map(function, args))

  File "stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438.pyx", line 370, in stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438._call_sampler_star

  File "stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438.pyx", line 403, in stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438._call_sampler

RuntimeError: Initialization failed.

From your error above, it might be a failure in initializing parallelization. Have you tried calling sampling with n_jobs = 1 to disable parallel processing and see if it runs?
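
A minimal sketch of that suggestion, assuming PyStan 2.x, where `StanModel.sampling` accepts an `n_jobs` argument. The helper name `sample_single_process` is hypothetical, and `model` stands for the compiled `StanModel` object. Whether this resolves the initialization error is not established here.

```python
def sample_single_process(model, data, iterations=1000, warmup=500):
    """Run one chain with parallel processing disabled (n_jobs=1).

    `model` is assumed to be a compiled pystan.StanModel (PyStan 2.x).
    """
    return model.sampling(
        data=data,
        chains=1,
        iter=iterations,
        warmup=warmup,
        n_jobs=1,  # run in the current process instead of a worker pool
    )
```

Since the `stan_cache` helper in the traceback ends with `return sm.sampling(**kwargs)`, passing `n_jobs=1` as an extra keyword to the existing `stan_cache(...)` call should have the same effect.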