Chains finish unexpectedly in new install of CmdStanR

Hi all,

I am trying to install CmdStanR on Windows 10. I installed CmdStan via a Conda environment and can compile the example model, but sampling fails. My problem looks very similar to the one described here.

file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")

# compile model
mod <- cmdstan_model(file) # this works fine!

# sampling
data_list <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
fit <- mod$sample( # this does not!
  data = data_list, 
  seed = 123, 
  chains = 4, 
  parallel_chains = 4,
  refresh = 500
)

But this fails with the following error.

Running MCMC with 4 parallel chains...

Warning: Chain 1 finished unexpectedly!

Warning: Chain 2 finished unexpectedly!

Warning: Chain 3 finished unexpectedly!

Warning: Chain 4 finished unexpectedly!

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning messages:
1: All chains finished unexpectedly! Use the $output(chain_id) method for more information.
 
2: No chains finished successfully. Unable to retrieve the fit. 

Following the warning messages, I ran fit$output_files(), but there were no CSV files (i.e., the output was character(0)). I thought it might have something to do with the parallel chains, but removing that option has no effect on the messages.
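The second warning points at the $output(chain_id) method, which prints whatever a chain wrote to the console before it stopped. A minimal sketch of collecting that text for every chain in one go is below. The helper name collect_chain_output is hypothetical, and it assumes fit is the object returned by mod$sample() above.

```r
# Collect the console output of each chain, even when the chain failed.
# `fit` is assumed to be the object returned by mod$sample().
# collect_chain_output() is a hypothetical helper, not part of cmdstanr.
collect_chain_output <- function(fit, chain_ids) {
  out <- lapply(chain_ids, function(id) {
    tryCatch(
      # $output(id) prints to the console, so capture what it prints
      utils::capture.output(fit$output(id)),
      error = function(e) paste("could not read output:", conditionMessage(e))
    )
  })
  names(out) <- paste0("chain_", chain_ids)
  out
}

# Usage, once `fit` exists in the session:
# collect_chain_output(fit, 1:4)
# fit$output_files(include_failed = TRUE)  # assumption: lists CSVs of failed chains too
```

If the captured text is empty for every chain, the executable probably stopped before Stan itself could write anything, which would point at the installation rather than at the model.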

I have also tried

fit <- cmdstanr_example(chains = 1)

but this fails with a similar error.

Any ideas as to what could be the problem here? Thank you in advance.

My current R and system info below:

> cmdstan_path()
[1] "C:/Users/n9401849/Anaconda3/envs/stan/Library/bin/cmdstan"
> cmdstan_version()
[1] "2.30.1"
> Sys.info()
         sysname          release          version         nodename          machine 
       "Windows"         "10 x64"    "build 19044" "QUT-PA00146740"         "x86-64" 
           login             user   effective_user 
      "n9401849"       "n9401849"       "n9401849" 
> R.version
               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          4                           
minor          0.5                         
year           2021                        
month          03                          
day            31                          
svn rev        80133                       
language       R                           
version.string R version 4.0.5 (2021-03-31)
nickname       Shake and Throw
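
Given the Conda path reported above, one thing that may be worth ruling out on Windows is whether the TBB library that CmdStan executables load at startup can be found through the PATH. This is only an assumption about the cause, not a confirmed diagnosis. The sketch below searches a folder for TBB DLLs and reports whether their directories are on the PATH. Both helper names are hypothetical, and where a Conda install actually places the DLL is also an assumption.

```r
# Return TRUE for each directory in `dirs` that appears on the PATH.
# Hypothetical helper; comparison is case-insensitive on Windows only.
dirs_on_path <- function(dirs, path = Sys.getenv("PATH")) {
  norm <- function(x) {
    x <- normalizePath(x, winslash = "/", mustWork = FALSE)
    x <- sub("/+$", "", x)  # drop trailing slashes
    if (.Platform$OS.type == "windows") tolower(x) else x
  }
  entries <- norm(strsplit(path, .Platform$path.sep, fixed = TRUE)[[1]])
  norm(dirs) %in% entries
}

# List files that look like TBB DLLs anywhere below `root`.
# Hypothetical helper; the file name pattern is an assumption.
find_tbb_dlls <- function(root) {
  list.files(root, pattern = "^tbb.*\\.dll$", recursive = TRUE,
             full.names = TRUE, ignore.case = TRUE)
}

# Usage with an installed cmdstanr:
# dll_dirs <- unique(dirname(find_tbb_dlls(cmdstanr::cmdstan_path())))
# data.frame(dir = dll_dirs, on_path = dirs_on_path(dll_dirs))
```

An empty result only means no matching DLL was found under that folder. With a Conda install the library may live elsewhere in the environment.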

I re-installed the latest versions of everything yesterday as well and encountered the same error (with both the meanfield and sampling algorithms, and with one chain or multiple chains).

There’s this message that popped up too:

image

Compiling Stan program...
Start sampling
Running MCMC with 4 parallel chains...

Warning: Chain 1 finished unexpectedly!

Warning: Chain 2 finished unexpectedly!

Warning: Chain 3 finished unexpectedly!

Warning: Chain 4 finished unexpectedly!

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Error in cmdstanr::read_cmdstan_csv(out$output_files(), variables = "",  : 
  Assertion on 'files' failed: No file provided.
In addition: Warning messages:
1: All chains finished unexpectedly! Use the $output(chain_id) method for more information.
 
2: No chains finished successfully. Unable to retrieve the fit. 

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_Singapore.utf8  LC_CTYPE=English_Singapore.utf8   
[3] LC_MONETARY=English_Singapore.utf8 LC_NUMERIC=C                      
[5] LC_TIME=English_Singapore.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] brms_2.17.5    Rcpp_1.0.8.3   cmdstanr_0.5.3

loaded via a namespace (and not attached):
 [1] Brobdingnag_1.2-8    jsonlite_1.8.0       gtools_3.9.3        
 [4] StanHeaders_2.21.0-7 RcppParallel_5.1.5   threejs_0.3.3       
 [7] shiny_1.7.2          assertthat_0.2.1     posterior_1.2.2     
[10] distributional_0.3.0 stats4_4.2.0         tensorA_0.36.2      
[13] pillar_1.8.0         backports_1.4.1      lattice_0.20-45     
[16] glue_1.6.2           digest_0.6.29        promises_1.2.0.1    
[19] checkmate_2.1.0      colorspace_2.0-3     htmltools_0.5.2     
[22] httpuv_1.6.5         Matrix_1.4-1         plyr_1.8.7          
[25] dygraphs_1.1.1.6     pkgconfig_2.0.3      rstan_2.21.5        
[28] purrr_0.3.4          xtable_1.8-4         mvtnorm_1.1-3       
[31] scales_1.2.0         processx_3.7.0       later_1.3.0         
[34] tibble_3.1.8         bayesplot_1.9.0      generics_0.1.3      
[37] farver_2.1.1         ggplot2_3.3.6        ellipsis_0.3.2      
[40] DT_0.23              shinyjs_2.1.0        cli_3.3.0           
[43] crayon_1.5.1         magrittr_2.0.3       mime_0.12           
[46] ps_1.7.1             fansi_1.0.3          nlme_3.1-157        
[49] xts_0.12.1           pkgbuild_1.3.1       colourpicker_1.1.1  
[52] prettyunits_1.1.1    tools_4.2.0          loo_2.5.1           
[55] lifecycle_1.0.1      matrixStats_0.62.0   stringr_1.4.0       
[58] munsell_0.5.0        callr_3.7.0          compiler_4.2.0      
[61] rlang_1.0.4          grid_4.2.0           ggridges_0.5.3      
[64] rstudioapi_0.13      htmlwidgets_1.5.4    crosstalk_1.2.0     
[67] igraph_1.3.1         miniUI_0.1.1.1       base64enc_0.1-3     
[70] codetools_0.2-18     gtable_0.3.0         inline_0.3.19       
[73] abind_1.4-5          DBI_1.1.2            markdown_1.1        
[76] reshape2_1.4.4       R6_2.5.1             gridExtra_2.3       
[79] rstantools_2.2.0     zoo_1.8-10           knitr_1.39          
[82] bridgesampling_1.1-2 dplyr_1.0.9          fastmap_1.1.0       
[85] utf8_1.2.2           shinythemes_1.2.0    shinystan_2.6.0     
[88] stringi_1.7.6        parallel_4.2.0       vctrs_0.4.1         
[91] tidyselect_1.1.2     xfun_0.31            coda_0.19-4  

Sorry to hear you’re having a similar problem, @DominiqueMakowski. Following the recommendations in this post, I thought it best to tag @rok_cesnovar here, as this post has been unanswered for 5 days. Apologies if the post is now outdated, Rok.

I am having the same problem on an HPC cluster running Linux. I am using cmdstanr version 0.5.3 and CmdStan version 2.30.1.

Running MCMC with 4 parallel chains...

Warning: Chain 3 finished unexpectedly!

Warning: Chain 2 finished unexpectedly!

Warning: Chain 1 finished unexpectedly!

Chain 4 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
.......

Does this happen for all Stan models or just for a particular one? Can you run the following example model:

cmdstanr_example("logistic", method = "sample", quiet = FALSE)

Thanks for the quick reply.

I ran the example model and it finished successfully:

starting worker pid=336573 on localhost:11478 at 12:34:34.099
[1] 4
Running MCMC with 4 chains, at most 48 in parallel...

Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 1 Iteration:  100 / 2000 [  5%]  (Warmup) 
Chain 1 Iteration:  200 / 2000 [ 10%]  (Warmup) 
Chain 1 Iteration:  300 / 2000 [ 15%]  (Warmup) 
Chain 1 Iteration:  400 / 2000 [ 20%]  (Warmup) 
Chain 1 Iteration:  500 / 2000 [ 25%]  (Warmup) 
Chain 1 Iteration:  600 / 2000 [ 30%]  (Warmup) 
Chain 1 Iteration:  700 / 2000 [ 35%]  (Warmup) 
Chain 1 Iteration:  800 / 2000 [ 40%]  (Warmup) 
Chain 1 Iteration:  900 / 2000 [ 45%]  (Warmup) 
Chain 1 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
Chain 1 Iteration: 1100 / 2000 [ 55%]  (Sampling) 
Chain 1 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
Chain 1 Iteration: 1300 / 2000 [ 65%]  (Sampling) 
Chain 1 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
Chain 1 Iteration: 1500 / 2000 [ 75%]  (Sampling) 
Chain 1 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
Chain 1 Iteration: 1700 / 2000 [ 85%]  (Sampling) 
Chain 1 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
Chain 1 Iteration: 1900 / 2000 [ 95%]  (Sampling) 
Chain 1 Iteration: 2000 / 2000 [100%]  (Sampling) 
Chain 2 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 2 Iteration:  100 / 2000 [  5%]  (Warmup) 
Chain 2 Iteration:  200 / 2000 [ 10%]  (Warmup) 
Chain 2 Iteration:  300 / 2000 [ 15%]  (Warmup) 
Chain 2 Iteration:  400 / 2000 [ 20%]  (Warmup) 
Chain 2 Iteration:  500 / 2000 [ 25%]  (Warmup) 
Chain 2 Iteration:  600 / 2000 [ 30%]  (Warmup) 
Chain 2 Iteration:  700 / 2000 [ 35%]  (Warmup) 
Chain 2 Iteration:  800 / 2000 [ 40%]  (Warmup) 
Chain 2 Iteration:  900 / 2000 [ 45%]  (Warmup) 
Chain 2 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
Chain 2 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
Chain 2 Iteration: 1100 / 2000 [ 55%]  (Sampling) 
Chain 2 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
Chain 2 Iteration: 1300 / 2000 [ 65%]  (Sampling) 
Chain 2 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
Chain 2 Iteration: 1500 / 2000 [ 75%]  (Sampling) 
Chain 2 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
Chain 2 Iteration: 1700 / 2000 [ 85%]  (Sampling) 
Chain 2 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
Chain 2 Iteration: 1900 / 2000 [ 95%]  (Sampling) 
Chain 2 Iteration: 2000 / 2000 [100%]  (Sampling) 
Chain 3 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 3 Iteration:  100 / 2000 [  5%]  (Warmup) 
Chain 3 Iteration:  200 / 2000 [ 10%]  (Warmup) 
Chain 3 Iteration:  300 / 2000 [ 15%]  (Warmup) 
Chain 3 Iteration:  400 / 2000 [ 20%]  (Warmup) 
Chain 3 Iteration:  500 / 2000 [ 25%]  (Warmup) 
Chain 3 Iteration:  600 / 2000 [ 30%]  (Warmup) 
Chain 3 Iteration:  700 / 2000 [ 35%]  (Warmup) 
Chain 3 Iteration:  800 / 2000 [ 40%]  (Warmup) 
Chain 3 Iteration:  900 / 2000 [ 45%]  (Warmup) 
Chain 3 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
Chain 3 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
Chain 3 Iteration: 1100 / 2000 [ 55%]  (Sampling) 
Chain 3 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
Chain 3 Iteration: 1300 / 2000 [ 65%]  (Sampling) 
Chain 3 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
Chain 3 Iteration: 1500 / 2000 [ 75%]  (Sampling) 
Chain 3 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
Chain 3 Iteration: 1700 / 2000 [ 85%]  (Sampling) 
Chain 3 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
Chain 3 Iteration: 1900 / 2000 [ 95%]  (Sampling) 
Chain 3 Iteration: 2000 / 2000 [100%]  (Sampling) 
Chain 4 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 4 Iteration:  100 / 2000 [  5%]  (Warmup) 
Chain 4 Iteration:  200 / 2000 [ 10%]  (Warmup) 
Chain 4 Iteration:  300 / 2000 [ 15%]  (Warmup) 
Chain 4 Iteration:  400 / 2000 [ 20%]  (Warmup) 
Chain 4 Iteration:  500 / 2000 [ 25%]  (Warmup) 
Chain 4 Iteration:  600 / 2000 [ 30%]  (Warmup) 
Chain 4 Iteration:  700 / 2000 [ 35%]  (Warmup) 
Chain 4 Iteration:  800 / 2000 [ 40%]  (Warmup) 
Chain 4 Iteration:  900 / 2000 [ 45%]  (Warmup) 
Chain 4 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
Chain 4 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
Chain 4 Iteration: 1100 / 2000 [ 55%]  (Sampling) 
Chain 4 Iteration: 1200 / 2000 [ 60%]  (Sampling) 
Chain 4 Iteration: 1300 / 2000 [ 65%]  (Sampling) 
Chain 4 Iteration: 1400 / 2000 [ 70%]  (Sampling) 
Chain 4 Iteration: 1500 / 2000 [ 75%]  (Sampling) 
Chain 4 Iteration: 1600 / 2000 [ 80%]  (Sampling) 
Chain 4 Iteration: 1700 / 2000 [ 85%]  (Sampling) 
Chain 4 Iteration: 1800 / 2000 [ 90%]  (Sampling) 
Chain 4 Iteration: 1900 / 2000 [ 95%]  (Sampling) 
Chain 4 Iteration: 2000 / 2000 [100%]  (Sampling) 
Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.
Chain 4 finished in 0.1 seconds.

All 4 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.5 seconds.

   variable   mean median   sd  mad     q5    q95 rhat ess_bulk ess_tail
 lp__       -65.97 -65.65 1.46 1.23 -68.80 -64.29 1.00     2112     2751
 alpha        0.38   0.38 0.22 0.22   0.03   0.73 1.00     4231     3068
 beta[1]     -0.67  -0.66 0.25 0.25  -1.08  -0.26 1.00     4380     2711
 beta[2]     -0.27  -0.27 0.22 0.22  -0.64   0.09 1.00     3819     2875
 beta[3]      0.68   0.67 0.27 0.27   0.25   1.14 1.00     3975     3173
 log_lik[1]  -0.51  -0.51 0.10 0.10  -0.69  -0.37 1.00     4178     3274
 log_lik[2]  -0.40  -0.38 0.15 0.14  -0.68  -0.20 1.00     4617     3387
 log_lik[3]  -0.50  -0.46 0.22 0.20  -0.89  -0.21 1.00     4110     3021
 log_lik[4]  -0.45  -0.43 0.15 0.14  -0.72  -0.24 1.00     3726     3085
 log_lik[5]  -1.19  -1.17 0.29 0.28  -1.68  -0.75 1.00     4578     2913

 # showing 10 of 105 rows (change via 'max_rows' argument or 'cmdstanr_max_rows' option)
Error while shutting down parallel: unable to terminate some child processes

If it is something related to my model, how can I make CmdStan print the actual error message?

The same Stan model runs locally without errors.
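
One way to see the raw error might be to run the compiled executable directly and print its exit status, stdout and stderr, bypassing the chain management in cmdstanr. The sketch below assumes processx is installed (cmdstanr depends on it). The helper names cmdstan_sample_args and run_exe_verbose are hypothetical, as is the model file name in the usage comment.

```r
# Build the command-line arguments for a plain CmdStan sampling run.
# Hypothetical helper, not part of cmdstanr.
cmdstan_sample_args <- function(data_file, output_file) {
  c("sample",
    "data", paste0("file=", data_file),
    "output", paste0("file=", output_file))
}

# Run the executable once and show everything it reports.
# error_on_status = FALSE keeps R from stopping on a non-zero exit code,
# so the status and stderr can still be inspected.
run_exe_verbose <- function(exe, data_file,
                            output_file = tempfile(fileext = ".csv")) {
  res <- processx::run(exe, cmdstan_sample_args(data_file, output_file),
                       error_on_status = FALSE)
  cat("exit status:", res$status, "\n")
  cat(res$stdout)
  cat(res$stderr)
  invisible(res)
}

# Usage with an installed cmdstanr (file and data names are placeholders):
# mod <- cmdstanr::cmdstan_model("my_model.stan")
# data_file <- tempfile(fileext = ".json")
# cmdstanr::write_stan_json(data_list, data_file)
# run_exe_verbose(mod$exe_file(), data_file)
```

A non-zero exit status with no Stan message in the output would suggest the process was stopped from outside, for example by a resource limit on the cluster, rather than by the model.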