rstan is not happy with me. It aborts when I attempt to sample from relatively simple models. Abortions occur differently depending on how many cores or chains I request, but (at least with this data/model) no combination seems to work. One common error message is the following:
Error in unserialize(socklist[[n]]) : error reading from connection
*Error in serialize(data, node$con, xdr = FALSE) : *
- error writing to connection*
But sometimes R simply crashes. The failure appears to occur at the start of post-warmup sampling.This behavior appeared since I upgraded my RStudio to 1.2.1335, R to 3.6.0, and rstan to 2.18.2 (running under Windows 7).
Others have experienced this too over the last few years as I see in these forums. Many of us saw it a few months ago at an R class given by J Gabry. However no consistent resolution (that I can comprehend) or even a general understanding of the behavior seems available. I’ve used Stan happily and successfully for 3+ years but I am deeply challenged by concepts like “Makevars.win”.
I’m hoping some kind person with greater understanding might look at the following code and model and give me insight to avoid these abortions. I apologize if I’ve not put this in the right Forum category or improperly formatted this note. I very much appreciate any suggestions!! Thank you. :-)
Here is my R code:
rm (list = ls (all = TRUE)) # caution: removes everything
library (rstan)
rstan_options (auto_write = TRUE)
options (mc.cores = parallel::detectCores ())
dataList ← list(
nobs = 92,
nbatch = 11,
batch= c(10,10,10,10,10,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,
3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,
6,6,6,6,7,7,7,7,8,8,8,8,9,9,10,10,10,10,10,11,11,11),
X = c(0,1,2,3,6,52,52,53,35,35,35,35,35,35,35,35,35,35,35,35,35,35,36,36,36,36,36,7,7,7,7,7,7,7,7,
7,7,7,7,8,8,8,8,24,36,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,8,35,35,35,36,36,36,36,36,37,
22,22,22,36,22,22,22,36,21,22,22,36,0,12,0,1,2,3,6,68,68,69),
Y = c(95.0,104.0,111.0,103.0,97.0,92.8,104.6,104.7,107.8,101.4,93.9,101.3,93.7,97.1,99.8,102.1,102.1,
106.4,100.3,99.2,99.6,102.5,108.6,99.2,98.5,107.0,96.3,108.1,123.5,114.6,123.0,107.6,118.8,114.3,
116.4,120.1,124.3,103.7,106.2,131.3,119.7,114.0,116.4,117.0,110.0,110.0,115.8,103.9,105.2,101.3,102.7,
106.1,97.5,99.5,102.7,107.8,106.9,108.6,98.2,115.1,105.1,83.0,86.4,85.1,89.3,85.0,89.3,84.7,
92.1,76.8,100.3,110.9,94.7,97.0,92.5,88.7,86.7,86.0,85.9,86.4,78.2,81.0,106.0,100.0,95.0,
98.0,110.0,103.0,105.0,79.8,94.0,84.2),
R = c(110,117,94,90,98,100,93,105,113,103,111,112,111,121),
A = c(1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
nrel = 14)
comp_model ← stan_model(‘Stan model.stan’)
print(comp_model)
StanFit ← sampling(comp_model, data=dataList, iter=2000, chains=4)
Here is my ‘Stan model.stan’ file (which btw I have used successfully in other analyses):
data {
int<lower=1> nobs; //number of stability data points
int<lower=1> nrel; //number of release values
real Y[nobs]; //measured stability responses
real R[nrel]; //measured release responses
real X[nobs]; //storage time in months
int<lower=1> nbatch; //number of batches on stability
int<lower=1, upper=nbatch> batch[nobs]; //stability batch id
real A[nobs];
}
parameters {
vector[2] beta; //intercept and slope
real<lower=0> sigma; //error sd
vector<lower=0>[2] sigma_u; //batch intercept-slope sds
cholesky_factor_corr[2] L_u;
matrix[2,nbatch] z_u;
real delta;
}
transformed parameters{
matrix[2,nbatch] u; // stability batch effects
matrix[2,2] C;
u = diag_pre_multiply(sigma_u, L_u) * z_u; //batch random effects
C = L_u*L_u’; // correlation matrix
}
model {
real mu;
//priors
L_u ~ lkj_corr_cholesky(2.0);
to_vector(z_u) ~ normal(0,1);
//likelihood
for (i in 1:nobs){
mu = beta[1] + u[1,batch[i]] + (beta[2] + delta*A[i] + u[2,batch[i]]) * X[i];
Y[i] ~ normal(mu, sigma);
}
for (i in 1:nrel){
R[i] ~ normal(beta[1], sqrt(sigma^2 + sigma_u[1]^2));
}
}
Below is the console output (some unimportant stuff removed):
R version 3.6.0 (2019-04-26) – “Planting of a Tree”
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
rm (list = ls (all = TRUE)) # caution: removes everything
#options(max.print=100000)
library (rstan)
Loading required package: ggplot2
Registered S3 methods overwritten by ‘ggplot2’:
method from
[.quosures rlang
c.quosures rlang
print.quosures rlang
Loading required package: StanHeaders
rstan (Version 2.18.2, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
For improved execution time, we recommend calling
Sys.setenv(LOCAL_CPPFLAGS = ‘-march=native’)
although this causes Stan to throw an error on a few processors.
comp_model ← stan_model(‘Stan model.stan’)
recompiling to avoid crashing R sessionStanFit ← sampling(comp_model, data=dataList, iter=2000, chains=4)
Error in unserialize(socklist[[n]]) : error reading from connection
Error in serialize(data, node$con, xdr = FALSE) :
error writing to connection