Rstan 2.26 - threads_per_chain always set to max even with rstan_options(threads_per_chain = 1)

Hello,

I am now seeing strange behaviour: threads_per_chain always seems to be set to the maximum, even with rstan_options(threads_per_chain = 1), and the model has become very slow (I am not sure whether the cause is that it tries to saturate all the cores).

> system("gcc --version")
gcc (GCC) 11.1.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /stornext/System/data/apps/R/R-4.1.2/lib64/R/lib/libRblas.so
LAPACK: /stornext/System/data/apps/R/R-4.1.2/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tictoc_1.0.1       cellsig_0.0.0.9000 forcats_0.5.1      stringr_1.4.0      dplyr_1.0.7        purrr_0.3.4        readr_2.1.1        tidyr_1.1.4        tibble_3.1.6       ggplot2_3.3.5     
[11] tidyverse_1.3.1    rstan_2.26.6       StanHeaders_2.26.6

loaded via a namespace (and not attached):
 [1] httr_1.4.2           foreach_1.5.1        jsonlite_1.7.2       modelr_0.1.8         RcppParallel_5.1.4   assertthat_0.2.1     posterior_1.1.0      distributional_0.2.2 stats4_4.1.2        
[10] tensorA_0.36.2       ggdist_3.0.1         cellranger_1.1.0     pillar_1.6.4         backports_1.4.1      lattice_0.20-45      glue_1.6.0           arrayhelpers_1.1-0   checkmate_2.0.0     
[19] rvest_1.0.2          colorspace_2.0-2     pkgconfig_2.0.3      broom_0.7.10         svUnit_1.0.6         haven_2.4.3          scales_1.1.1         processx_3.5.2       tzdb_0.2.0          
[28] generics_0.1.1       farver_2.1.0         ellipsis_0.3.2       withr_2.4.3          cli_3.1.0            magrittr_2.0.1       crayon_1.4.2         readxl_1.3.1         ps_1.6.0            
[37] data.tree_1.0.0      fs_1.5.2             fansi_0.5.0          xml2_1.3.3           pkgbuild_1.3.1       tools_4.1.2          loo_2.4.1            prettyunits_1.1.1    hms_1.1.1           
[46] lifecycle_1.0.1      matrixStats_0.61.0   V8_3.6.0             munsell_0.5.0        reprex_2.0.1         callr_3.7.0          compiler_4.1.2       tidybayes_3.0.1      rlang_0.4.12        
[55] grid_4.1.2           iterators_1.0.13     rstudioapi_0.13      gtable_0.3.0         codetools_0.2-18     abind_1.4-5          inline_0.3.19        DBI_1.1.1            curl_4.3.2          
[64] R6_2.5.1             gridExtra_2.3        rstantools_2.1.1     lubridate_1.8.0      utf8_1.2.2           stringi_1.7.6        parallel_4.1.2       Rcpp_1.0.7           vctrs_0.3.8         
[73] dbplyr_2.1.1         tidyselect_1.1.1     coda_0.19-4 

Makevars

CXX14FLAGS += -O3
CXX14FLAGS += -DSTAN_THREADS
CXX14FLAGS += -pthread
CXX14FLAGS += -march=native -mtune=native -fPIC
CXX14FLAGS += -DUSE_STANC3
CXX14=g++

MAKEFLAGS = -j8

Can you test the experimental version?

remove.packages(c("StanHeaders", "rstan"))
remotes::install_github("stan-dev/rstan/StanHeaders@experimental", upgrade = "always", force = TRUE)
remotes::install_github("stan-dev/rstan/rstan/rstan@experimental", upgrade = "always", force = TRUE)

Also, make sure that STAN_NUM_THREADS is not defined in your environment, or set it to 1:

Sys.getenv("STAN_NUM_THREADS")
Sys.setenv("STAN_NUM_THREADS" = 1)
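To see at a glance which thread-related variables are set in the session, something like the following could help. It is only a sketch using base R: `thread_env_report` is a hypothetical helper, and the variables other than STAN_NUM_THREADS are my assumption about what else might influence threading, not something confirmed in this thread.

```r
# Report thread-related environment variables (base R only).
# `thread_env_report` is a hypothetical helper; the default variable list
# is an assumption, not an official or exhaustive list.
thread_env_report <- function(vars = c("STAN_NUM_THREADS",
                                        "RCPP_PARALLEL_NUM_THREADS",
                                        "OMP_NUM_THREADS")) {
  # unset = NA lets us tell "not defined" apart from "defined but empty"
  values <- unname(Sys.getenv(vars, unset = NA_character_))
  data.frame(
    variable = vars,
    value = values,
    is_set = !is.na(values),
    stringsAsFactors = FALSE
  )
}

# Set the variable as suggested above, then inspect the session.
Sys.setenv(STAN_NUM_THREADS = "1")
print(thread_env_report())
```

Running this before and after loading the package might show whether something else in the session is changing these values.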

Thanks,

this happened when I created a clean rstan package, rather than creating the model and calling it outside projects and packages.

I have now abandoned that effort and gone back to the environment that works (it is fast and uses the number of cores it should). But I will try.

A possible problem could be my messy gcc environment, in case I compile with one GCC version and run with an older one (perhaps?).

Thanks for the tip.

I upgraded to the experimental version, but I still see:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                            
50308 mangiol+  20   0 5913516 917992  35620 R 920.6  0.2   2:09.95 session 

It is using over 900% CPU with

  • one chain,
  • one core,
  • Sys.setenv("STAN_NUM_THREADS" = 1)
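The CPU figure from top can also be approximated from inside R by comparing CPU time with wall-clock time. This is a rough sketch using base R only: `cpu_to_wall_ratio` is a hypothetical helper, the loop is a stand-in workload, and I am assuming that CPU time from all threads of the R process is counted in the process totals.

```r
# Ratio of CPU time to wall-clock time for an expression (base R only).
# A value near 1 suggests one busy core; a value near 9 would roughly
# match the ~900% that top reports. `cpu_to_wall_ratio` is a hypothetical
# helper, not part of rstan.
cpu_to_wall_ratio <- function(expr) {
  timing <- system.time(expr)  # forces the lazily evaluated `expr`
  cpu <- sum(timing[c("user.self", "sys.self")], na.rm = TRUE)
  elapsed <- timing[["elapsed"]]
  if (elapsed <= 0) return(NA_real_)  # too fast to measure
  cpu / elapsed
}

# Stand-in workload; replace with the sampling call in a real session.
ratio <- cpu_to_wall_ratio(for (i in 1:2e6) sqrt(i))
print(ratio)
```

Wrapping the actual sampling call instead of the loop should give a number that can be compared between rstan and cmdstanr runs.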

That is very unfortunate. It is hard to use reduce_sum with Rstan, and cmdstanr is not great for building packages based on Stan (it saves the fit to a file rather than into the returned object).

Maybe cmdstanr can do what you want:

Thanks. Yes, I converted my package to cmdstanr, but I hope rstan can catch up, as it is a better setup for developing Stan-based R packages.

Hello, now that rstan has been updated, I was hoping that this problem would be solved, but it persists even with

rstan_options(threads_per_chain = 1)

This issue is associated with a heavy slowdown of the fit (for some reason) compared with the correct behaviour of cmdstanr.

Note: I still need rstan because it suits the creation of R packages for third-party users.

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Melbourne
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sccomp_1.7.5 dplyr_1.1.4 

loaded via a namespace (and not attached):
 [1] dotCall64_1.1-1             SummarizedExperiment_1.32.0 gtable_0.3.4                spam_2.10-0                 ggplot2_3.4.4              
 [6] QuickJSR_1.0.9              ggrepel_0.9.5               Biobase_2.62.0              inline_0.3.19               lattice_0.21-8             
[11] tzdb_0.4.0                  vctrs_0.6.5                 tools_4.3.1                 bitops_1.0-7                generics_0.1.3             
[16] stats4_4.3.1                curl_5.2.0                  parallel_4.3.1              tibble_3.2.1                fansi_1.0.6                
[21] pkgconfig_2.0.3             Matrix_1.6-5                S4Vectors_0.40.2            RcppParallel_5.1.7          lifecycle_1.0.4            
[26] GenomeInfoDbData_1.2.11     compiler_4.3.1              stringr_1.5.1               munsell_0.5.0               codetools_0.2-19           
[31] SeuratObject_5.0.1          GenomeInfoDb_1.38.5         RCurl_1.98-1.14             crayon_1.5.2                pillar_1.9.0               
[36] tidyr_1.3.0                 SingleCellExperiment_1.24.0 DelayedArray_0.28.0         StanHeaders_2.32.5          abind_1.4-5                
[41] boot_1.3-28.1               parallelly_1.36.0           rstan_2.32.5                tidyselect_1.2.0            digest_0.6.34              
[46] stringi_1.8.3               future_1.33.1               purrr_1.0.2                 listenv_0.9.0               forcats_1.0.0              
[51] grid_4.3.1                  SparseArray_1.2.3           colorspace_2.1-0            cli_3.6.2                   magrittr_2.0.3             
[56] patchwork_1.2.0             S4Arrays_1.2.0              loo_2.6.0                   pkgbuild_1.4.3              utf8_1.2.4                 
[61] future.apply_1.11.1         withr_3.0.0                 readr_2.1.5                 scales_1.3.0                sp_2.1-2                   
[66] XVector_0.42.0              matrixStats_1.2.0           globals_0.16.2              gridExtra_2.3               progressr_0.14.0           
[71] hms_1.1.3                   GenomicRanges_1.54.1        IRanges_2.36.0              V8_4.4.1                    rstantools_2.3.1.1         
[76] rlang_1.1.3                 Rcpp_1.0.12                 glue_1.7.0                  BiocGenerics_0.48.1         rstudioapi_0.15.0          
[81] jsonlite_1.8.8              R6_2.5.1                    zlibbioc_1.48.0             MatrixGenerics_1.14.0