I’m trying to perform linear regression on some synthetic data that I generated. When I fit the model, one chain typically fails to run (in the output below, chain 0 never gets past iteration 1), though once in a while all four chains finish. The model, Python code, and synthetic data can be downloaded as a zip at this link. This seems like a simple example, so I’d like to understand why it’s failing.
My model is
data {
  int<lower=0> N;
  vector[N] p1;
  vector[N] sex;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  p1 ~ normal(alpha + beta * sex, sigma);
}
and my Python code is
import pandas as pd
import pystan
data = pd.read_table('example.tsv', index_col=0)
# Shift p1 by +1 where sex == 1 (.loc replaces the deprecated .ix indexer)
data.loc[data.sex == 1, 'p1'] += 1
# Shift every p1 value by +2
data['p1'] += 2
p1_dat = {
    'N': data.shape[0],
    'p1': data.p1.values,
    'sex': data.sex.values,
}
fit = pystan.stan(file='model.stan', data=p1_dat, iter=1000, chains=4)
print('finished')
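As a sanity check that the regression itself is well-posed, independent of Stan, the same model can be fit by ordinary least squares. This is only a sketch: the helper `ols_fit` is a hypothetical name, and the simulated arrays are a stand-in for `example.tsv`, with an intercept of 2, a slope of 1, and unit noise all assumed for illustration rather than taken from the real file.

```python
import numpy as np

def ols_fit(y, x):
    """Least-squares estimates (alpha, beta, sigma) for y = alpha + beta * x + noise."""
    # Design matrix: a column of ones for the intercept, then the predictor
    X = np.column_stack([np.ones(len(x)), x])
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X.dot(coef)
    # Residual standard deviation with two fitted coefficients
    sigma = np.sqrt(resid.dot(resid) / (len(y) - 2))
    return coef[0], coef[1], sigma

# Hypothetical stand-in for example.tsv (assumed values, not the real data)
rng = np.random.RandomState(0)
sex = rng.randint(0, 2, size=500).astype(float)
p1 = 2.0 + 1.0 * sex + rng.normal(0.0, 1.0, size=500)

alpha_hat, beta_hat, sigma_hat = ols_fit(p1, sex)
print(alpha_hat, beta_hat, sigma_hat)
```

With the real data, `data.p1.values` and `data.sex.values` would be passed in place of the simulated arrays. If the estimates look sensible, the data and model are probably not what is stopping the chain.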
Python info:
$ python --version
Python 2.7.13 :: Anaconda 4.3.0 (x86_64)
I’m on OSX 10.12.5. I installed pystan using conda:
>>> pystan.__version__
'2.14.0.0'
Output:
$ python model.py
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_57b35883c13d5c0484d7cd7da3f6582c NOW.
Iteration: 1 / 1000 [ 0%] (Warmup) (Chain 0)
Iteration: 1 / 1000 [ 0%] (Warmup) (Chain 1)
Iteration: 1 / 1000 [ 0%] (Warmup) (Chain 2)
Iteration: 1 / 1000 [ 0%] (Warmup) (Chain 3)
Iteration: 100 / 1000 [ 10%] (Warmup) (Chain 1)
Iteration: 100 / 1000 [ 10%] (Warmup) (Chain 2)
Iteration: 100 / 1000 [ 10%] (Warmup) (Chain 3)
Iteration: 200 / 1000 [ 20%] (Warmup) (Chain 1)
Iteration: 200 / 1000 [ 20%] (Warmup) (Chain 2)
Iteration: 200 / 1000 [ 20%] (Warmup) (Chain 3)
Iteration: 300 / 1000 [ 30%] (Warmup) (Chain 3)
Iteration: 300 / 1000 [ 30%] (Warmup) (Chain 1)
Iteration: 300 / 1000 [ 30%] (Warmup) (Chain 2)
Iteration: 400 / 1000 [ 40%] (Warmup) (Chain 3)
Iteration: 400 / 1000 [ 40%] (Warmup) (Chain 2)
Iteration: 400 / 1000 [ 40%] (Warmup) (Chain 1)
Iteration: 500 / 1000 [ 50%] (Warmup) (Chain 2)
Iteration: 501 / 1000 [ 50%] (Sampling) (Chain 2)
Iteration: 500 / 1000 [ 50%] (Warmup) (Chain 1)
Iteration: 500 / 1000 [ 50%] (Warmup) (Chain 3)
Iteration: 501 / 1000 [ 50%] (Sampling) (Chain 3)
Iteration: 501 / 1000 [ 50%] (Sampling) (Chain 1)
Iteration: 600 / 1000 [ 60%] (Sampling) (Chain 1)
Iteration: 600 / 1000 [ 60%] (Sampling) (Chain 2)
Iteration: 600 / 1000 [ 60%] (Sampling) (Chain 3)
Iteration: 700 / 1000 [ 70%] (Sampling) (Chain 1)
Iteration: 700 / 1000 [ 70%] (Sampling) (Chain 2)
Iteration: 700 / 1000 [ 70%] (Sampling) (Chain 3)
Iteration: 800 / 1000 [ 80%] (Sampling) (Chain 1)
Iteration: 800 / 1000 [ 80%] (Sampling) (Chain 2)
Iteration: 800 / 1000 [ 80%] (Sampling) (Chain 3)
Iteration: 900 / 1000 [ 90%] (Sampling) (Chain 1)
Iteration: 900 / 1000 [ 90%] (Sampling) (Chain 2)
Iteration: 900 / 1000 [ 90%] (Sampling) (Chain 3)
Iteration: 1000 / 1000 [100%] (Sampling) (Chain 2)
#
# Elapsed Time: 0.282239 seconds (Warm-up)
# 0.277842 seconds (Sampling)
# 0.560081 seconds (Total)
#
Iteration: 1000 / 1000 [100%] (Sampling) (Chain 1)
#
# Elapsed Time: 0.28384 seconds (Warm-up)
# 0.284098 seconds (Sampling)
# 0.567938 seconds (Total)
#
Iteration: 1000 / 1000 [100%] (Sampling) (Chain 3)
#
# Elapsed Time: 0.281971 seconds (Warm-up)
# 0.317535 seconds (Sampling)
# 0.599506 seconds (Total)