Spooky segfault from rstan + cmdstanr:::expose_functions

I’m dealing with a really strange segfault, reproducible as follows.

I launch a brand new AWS instance running Ubuntu 22.04. I then run the following exactly as below, answering yes and hitting enter whenever prompted. R segfaults on the last line (library(rstan)). It also segfaults on the final line if I switch the order of the final two lines. That is, the segfault is triggered whenever rstan is loaded and a cmdstan function is exposed in the same session, regardless of the order of operations.

I still see the segfault if I remove rstan and StanHeaders and reinstall either 2.26 from mc-stan.org or 2.32 from the experimental branch (first installing RcppEigen 3.4 from Hamada’s branch).

The segfault is not triggered if I run library(StanHeaders); it has to be rstan itself.
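A rough sketch of how the order-of-operations claims above could be checked, with each case in its own fresh R process so that no state leaks between them. The helper name run_case and the case labels are made up for illustration; the R calls are the ones from the reproduction in this post. Output is discarded here, so only the exit status is reported. On Linux a process killed by a segfault typically shows status 139.

```shell
# Hypothetical helper: run an R expression in a fresh process and report
# how that process exited. Output is discarded to keep the summary short.
run_case() {
  label=$1
  expr=$2
  Rscript -e "$expr" >/dev/null 2>&1
  status=$?
  echo "$label: exit status $status"
}

# Single quotes keep the shell from expanding the `$` in mod$expose_functions().
setup='mod <- cmdstanr::cmdstan_model(cmdstanr::write_stan_file("functions { real generic_function(real x) { return x; } }"))'
expose='mod$expose_functions()'

run_case "expose, then rstan"       "$setup; $expose; library(rstan)"
run_case "rstan, then expose"       "library(rstan); $setup; $expose"
run_case "expose, then StanHeaders" "$setup; $expose; library(StanHeaders)"
```

If the description above holds, the first two cases should exit abnormally and the third should exit with status 0.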

I don’t see anything terribly suspicious in the onLoad or onAttach for rstan, but I don’t really know what I’m looking for.

Hoping for some pointers, as discussed briefly at the Stan meeting today. @stevebronder @jonah @rok_cesnovar @bgoodri @hsbadr @andrjohns

sudo apt update -qq
sudo apt install --no-install-recommends software-properties-common dirmngr
sudo apt install -y make
sudo apt install gcc
sudo apt install g++
sudo apt install zlib1g
sudo apt install libz-dev
sudo apt install libblas-dev liblapack-dev
sudo apt install gfortran
sudo apt install libv8-dev
sudo apt install libcurl4-openssl-dev

wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | sudo tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
sudo add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
sudo apt install r-base

R
install.packages("remotes")
install.packages("rstan")
remotes::install_github("stan-dev/cmdstanr")
cmdstanr::check_cmdstan_toolchain()
cmdstanr::install_cmdstan(cores = 4)

generic_stan_function <- 
  "
  real generic_function (real x) {
    return x;
  }
"
stan_file <- cmdstanr::write_stan_file(
  code = paste("functions{", generic_stan_function, "}", sep = "\n")
)
mod <- cmdstanr::cmdstan_model(stan_file)
mod$expose_functions()
library(rstan)

That’s a very tricky kind of thing to debug. Maybe @bgoodri might know.

Is there a reason you need to run both of these things together?


Yeah, definitely beyond the limits of my debugging ken.

For my particular case, I can code around this without too much trouble. However, as of a few weeks ago, brms users on the development version of brms and the new cmdstanr release will hit this when calling brms::expose_functions() on a model fit with the cmdstanr backend. Apparently brms loads enough of rstan to trigger this issue, and as of recently brms calls cmdstanr::expose_functions() when exposing functions from models fit via recent versions of cmdstanr. It seems worth understanding whether there’s a good way to avoid segfaulting with that functionality.


Just had a minute to look at this, and so far this does not segfault on my machine. What AWS instance type are you using? Below is my session info.

R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 11 (bullseye)

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.21.8        ggplot2_3.4.2       StanHeaders_2.26.27

loaded via a namespace (and not attached):
 [1] Matrix_1.5-3         gtable_0.3.3         jsonlite_1.8.7      
 [4] compiler_4.3.1       crayon_1.5.2         Rcpp_1.0.11         
 [7] parallel_4.3.1       gridExtra_2.3        callr_3.7.3         
[10] scales_1.2.1         lattice_0.20-45      R6_2.5.1            
[13] generics_0.1.3       distributional_0.3.2 RcppEigen_0.3.3.9.3 
[16] backports_1.4.1      checkmate_2.2.0      tibble_3.2.1        
[19] desc_1.4.2           munsell_0.5.0        rprojroot_2.0.3     
[22] pillar_1.9.0         posterior_1.4.1      rlang_1.1.1         
[25] utf8_1.2.3           inline_0.3.19        RcppParallel_5.1.7  
[28] cli_3.6.1            withr_2.5.0          magrittr_2.0.3      
[31] ps_1.7.5             grid_4.3.1           processx_3.8.2      
[34] cmdstanr_0.6.0.9000  remotes_2.4.2.1      lifecycle_1.0.3     
[37] prettyunits_1.1.1    vctrs_0.6.3          glue_1.6.2          
[40] tensorA_0.36.2       farver_2.1.1         codetools_0.2-19    
[43] stats4_4.3.1         abind_1.4-5          pkgbuild_1.4.2      
[46] fansi_1.0.4          colorspace_2.1-0     matrixStats_1.0.0   
[49] loo_2.6.0            tools_4.3.1          pkgconfig_2.0.3     


Also, can you post your session info from that AWS instance?

And can you attach strace to your R session when you run

mod <- cmdstanr::cmdstan_model(stan_file)
mod$expose_functions()
library(rstan)

then put the trace into a gist and post it here? That should help us see which files and libraries are being touched.
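In case it helps, here is one way the trace could be captured. This is only a sketch: the file names repro.R and rstan_segfault.strace are placeholders, and it runs the reproduction under strace from the start rather than attaching to a live session. It skips the trace quietly if strace or Rscript is not installed.

```shell
# Write the reproduction to a script. The quoted 'EOF' keeps the shell
# from expanding the `$` in mod$expose_functions().
cat > repro.R <<'EOF'
stan_file <- cmdstanr::write_stan_file(
  "functions { real generic_function(real x) { return x; } }"
)
mod <- cmdstanr::cmdstan_model(stan_file)
mod$expose_functions()
library(rstan)
EOF

# -f follows child processes (e.g. the compiler started by expose_functions),
# -o writes the trace to a file instead of stderr.
# `|| true` keeps the shell going when R exits abnormally.
if command -v strace >/dev/null 2>&1 && command -v Rscript >/dev/null 2>&1; then
  strace -f -o rstan_segfault.strace Rscript repro.R || true
fi
```

To attach to an R session that is already running instead, get its process ID with Sys.getpid() in R and pass it to strace with -p from another terminal.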
