RuntimeError: Initialization failed

forbesg · September 20, 2018, 8:55am

Hi All,

I'm new to Stan and am running a tutorial on a CLV model, found here:

datascienceinc/oreilly-intro-to-predictive-clv/blob/master/oreilly-an-intro-to-predictive-clv-tutorial.ipynb


I'm running Python 3.6 in a conda environment.

I cannot work out what's going wrong when I execute the model. I keep getting this error (bottom part of the traceback only):

  File "/Users/garforbe/anaconda3/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value

RuntimeError: Initialization failed.

I'm building the data object as below:

# here's the data we will provide to STAN :
data={'n_cust':len(rfm),
    'x':rfm['frequency'].values,
    'tx':rfm['recency'].values,
    'T':rfm['T'].values
}

Then I compile and run the model through the tutorial's stan_cache helper (this is where it fails). The model is below.

iterations = 1000
warmup = 500

# I recommend training for several 1000's iterations. Here we run the STAN model :
pareto_nbd_fit = stan_cache(paretonbd_model, model_name='paretonbd_model',
                            data=data, chains=1, iter=iterations, warmup=warmup)

data{
int<lower=0> n_cust; //number of customers 
vector<lower=0>[n_cust] x; 
vector<lower=0>[n_cust] tx; 
vector<lower=0>[n_cust] T; 
}

parameters{
// vectors of lambda and mu for each customer. 
// Here I apply limits between 0 and 1 for each 
// parameter. A value of lambda or mu > 1.0 is unphysical 
// since you don't have enough time resolution to go less than 
// 1 time unit. 
vector <lower=0,upper=1.0>[n_cust] lambda; 
vector <lower=0,upper=1.0>[n_cust] mu;

// parameters of the prior distributions : r, alpha, s, beta. 
// for both lambda and mu
real <lower=0>r;
real <lower=0>alpha;
real <lower=0>s;
real <lower=0>beta;
}

model{

// temporary variables : 
vector[n_cust] like1; // likelihood
vector[n_cust] like2; // likelihood 

// Establishing hyperpriors on parameters r, alpha, s, and beta. 
r ~ normal(0.5,0.1);
alpha ~ normal(10,1);
s ~ normal(0.5,0.1);
beta ~ normal(10,1);

// Establishing the Prior Distributions for lambda and mu : 
lambda ~ gamma(r,alpha); 
mu ~ gamma(s,beta);

// The likelihood of the Pareto/NBD model : 
like1 = x .* log(lambda) + log(mu) - log(mu+lambda) - tx .* (mu+lambda);
like2 = (x + 1) .* log(lambda) - log(mu+lambda) - T .* (lambda+mu);

// Here we increment the log probability density (target) accordingly 
target+= log(exp(like1)+exp(like2));
}
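One thing that may be worth checking in the last line of the model (this is an assumption, not something confirmed in this thread): `log(exp(like1)+exp(like2))` can underflow when both terms are very negative, for example when `T` and `tx` are large numbers. Both exponentials then round to zero, the log becomes negative infinity, and Stan cannot find a finite starting point. The sketch below uses plain Python with made-up values to show the effect next to the usual log-sum-exp rearrangement. The function names are hypothetical.

```python
import math

def naive_log_sum(a, b):
    """log(exp(a) + exp(b)) computed directly, as in the model's target line."""
    total = math.exp(a) + math.exp(b)  # both terms underflow to 0.0 for very negative a, b
    # math.log(0.0) raises ValueError in Python, so return -inf explicitly
    return math.log(total) if total > 0.0 else float("-inf")

def stable_log_sum(a, b):
    """Same quantity, rearranged so the largest term is factored out first."""
    m = max(a, b)
    if m == float("-inf"):
        return m
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Illustrative values only; not taken from the CDNOW data.
like1, like2 = -800.0, -805.0
naive = naive_log_sum(like1, like2)    # -inf
stable = stable_log_sum(like1, like2)  # finite, slightly above -800
print(naive, stable)
```

Stan has a built-in `log_sum_exp` function that does this rearrangement, so applying it per customer in place of the direct expression might be enough if underflow turns out to be the problem.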

I have a feeling it's something with the data setup, but I'm not entirely sure. Any help would be much appreciated!
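A quick sanity check on the data dict could help confirm or rule out the data setup. The sketch below is a minimal, hypothetical helper (`check_stan_data` is not part of PyStan or the tutorial). It only tests the constraints declared in the data block above (lengths equal to `n_cust`, no negative values), plus two assumptions of mine: no NaN or infinite entries, and recency `tx` not exceeding `T`.

```python
import math

def check_stan_data(data):
    """Return a list of human-readable problems found in the data dict."""
    problems = []
    n = data["n_cust"]
    for name in ("x", "tx", "T"):
        values = list(data[name])
        if len(values) != n:
            problems.append(f"{name}: length {len(values)} != n_cust {n}")
        if any(math.isnan(v) or math.isinf(v) for v in values):
            problems.append(f"{name}: contains NaN or infinite values")
        elif any(v < 0 for v in values):
            problems.append(f"{name}: contains negative values")
    # Assumption: recency should not exceed the observation period T.
    if len(data["tx"]) == len(data["T"]) and any(
        tx > T for tx, T in zip(data["tx"], data["T"])
    ):
        problems.append("tx: some recency values exceed T")
    return problems

# Toy values for illustration; replace with the real rfm-derived dict.
toy = {"n_cust": 3, "x": [0, 2, 5], "tx": [0.0, 10.0, 30.0], "T": [38.0, 38.0, 39.0]}
bad = dict(toy, tx=[0.0, float("nan"), 30.0])
print(check_stan_data(toy))
print(check_stan_data(bad))
```

An empty list means none of these particular checks failed. It does not guarantee that the model can be initialized.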

Regards

forbesg · September 20, 2018, 9:06am

I updated my data object as a test. It seems to have run further, but I still get an error:

# here's the data we will provide to STAN :

data={'n_cust':len(rfm),
    'x':rfm['frequency'].values.tolist(),
    'tx':rfm['recency'].values.tolist(),
    'T':rfm['T'].values.tolist()
} 


  File "<ipython-input-39-30a0b1a65a2c>", line 1, in <module>
    pareto_nbd_fit = stan_cache(paretonbd_model, model_name='paretonbd_model',                                   data=data, chains=1, iter=iterations, warmup=warmup)

  File "<ipython-input-37-c77f7cab1c58>", line 16, in stan_cache
    return sm.sampling(**kwargs)

  File "/Users/garforbe/anaconda3/lib/python3.6/site-packages/pystan/model.py", line 776, in sampling
    ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)

  File "/Users/garforbe/anaconda3/lib/python3.6/site-packages/pystan/model.py", line 91, in _map_parallel
    map_result = list(map(function, args))

  File "stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438.pyx", line 370, in stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438._call_sampler_star

  File "stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438.pyx", line 403, in stanfit4anon_model_76111598b747c5ab299e594dcfa74504_8591483796388359438._call_sampler

RuntimeError: Initialization failed.

mirkhosro · October 23, 2018, 10:21pm

From your error above, it might be a failure in initializing parallelization. Have you tried calling sampling with n_jobs=1 to disable parallel processing, to see if it runs?
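A minimal sketch of that suggestion, assuming PyStan 2 (where `StanModel.sampling` accepts `n_jobs` and `verbose`) and assuming `sm` is the compiled model that `stan_cache` builds internally. The wrapper name `sample_serial` is hypothetical.

```python
def sample_serial(sm, data, **kwargs):
    """Call sm.sampling in a single process with verbose output.

    sm is expected to be a compiled pystan.StanModel (PyStan 2 assumed).
    """
    kwargs.pop("n_jobs", None)   # force serial execution regardless of caller input
    kwargs.pop("verbose", None)  # always print Stan's messages
    kwargs.setdefault("chains", 1)
    return sm.sampling(data=data, n_jobs=1, verbose=True, **kwargs)

# Usage, with sm and data defined as in the posts above:
# fit = sample_serial(sm, data, iter=1000, warmup=500)
```

With `verbose=True`, Stan's own messages about rejected initial values should be printed, and they usually say more than the bare `RuntimeError`.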
