Problem with Stan modeling: extremely slow compilation

Hello, I just upgraded my computer (MacBook Air M1 2020) to a more powerful one (MacBook Air M3 with additional RAM), and started using RStan again. To my surprise, Stan is extremely slow, both when starting models and when completing fits.

All the models I tried had been tested with Stan on my previous computer and ran very smoothly there, yet they all look completely broken on my new machine. I also noticed that when a model gets stuck and I interrupt the fit and start running other code in R, chain-progress messages continue to pop up in the console even minutes later, even though the fit has been interrupted and I am able to run other code.

In one of my attempted fits, a specific error message accompanied the chains that reached 100% during this post-interruption progression. The message read:

"Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Calls: <Anonymous> ... doTryCatch -> sendData -> sendData.SOCK0node -> serialize
Execution halted"  

Something is clearly wrong with rstan on my new computer. This could be a problem of dependencies, but I don’t see where it comes from. Any help would be greatly appreciated.

One example model

data {
  int<lower=0> N;
  vector[N] Cd_beans;
  vector[N] Cd_disp;
}
parameters {
  real<lower=0, upper=8> Cd_lim;
  real<lower=0, upper=1.5> Cd_aff;
  real<lower=0, upper=5> sigma_Cd;
}
transformed parameters {
  vector[N] mu;
  for (i in 1:N) {
    mu[i] = Cd_lim * Cd_disp[i] / (Cd_aff + Cd_disp[i]);
  }
}
model {
  Cd_beans ~ normal(mu, sigma_Cd);
  Cd_lim ~ normal(4, 2);
  Cd_aff ~ normal(0.5, 0.5);
  sigma_Cd ~ lognormal(0.5, 1);
}

Running the example model

Note: the Stan file containing the model presented above is called “240415_MM_raw.stan” and is located in a folder called “stan_files” in the working directory.

library(rstan)
library(shinystan)

fit_stan <- function(dict_mod, stan_file, stan_fold='stan_files') {
  options(mc.cores=parallel::detectCores())
  path_stan <- paste(stan_fold, paste0(stan_file, '.stan'), sep='/')
  mod <- stan(file=path_stan, data=dict_mod, cores=getOption("mc.cores", 1L))
  dict_stan_res <- dict_mod
  dict_stan_res[['fit']] <- mod
  dict_stan_res[['model']] <- stan_file
  return(dict_stan_res)
}

dict_stan_res_MM <- fit_stan(dict_mod=dict_mod, stan_file="240415_MM_raw", stan_fold='stan_files')
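A side note on the `cores` setting used above: rstan runs at most one chain per core, so with the default of four chains, requesting every core reported by `parallel::detectCores()` should not add speed beyond four. A minimal sketch of capping the core count at the number of chains follows; the helper name `pick_cores` is hypothetical and not part of rstan.

```r
# Hypothetical helper (not part of rstan): use no more cores than chains.
pick_cores <- function(chains = 4L) {
  available <- parallel::detectCores()
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(available)) available <- 1L
  max(1L, min(as.integer(chains), as.integer(available)))
}

# Set the option that the fit_stan() function above reads via getOption().
options(mc.cores = pick_cores(chains = 4L))
```

This only changes how many worker processes are requested; it is not expected to fix a broken toolchain.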

Code for data simulation

I attach here some simple code that creates simulated data that can immediately be used to launch the model. That said, I am convinced that the problem doesn’t come from the model (it ran perfectly on my former computer). The only explanation I can see is that I have some missing dependencies, which I can’t find.

create_MM_data <- function(N, L, K) {
  soil_vect <- runif(n=N, min=0, max=1)
  err_vect <- rnorm(n=N, mean=0, sd=1)
  bean_vect <- apply(X=data.frame(soil_vect), FUN=function(x) {L*x/(x+K)}, MARGIN=1) + err_vect
  dict_mod <- list("N"=N, "Cd_disp"=soil_vect, "Cd_beans"=bean_vect)
  return(dict_mod)
}

dict_mod <- create_MM_data(N=456, L=4, K=0.7)
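When comparing behaviour across two machines, it may help to fix the random seed so that both fits see the same simulated data. Below is a sketch of the same simulation written with vectorized arithmetic instead of `apply()`; the function name `create_MM_data_seeded` and the `seed` argument are assumptions added for illustration.

```r
# Sketch: same saturating curve L * x / (x + K) plus Gaussian noise,
# but vectorized and with an explicit seed (the seed argument is an addition).
create_MM_data_seeded <- function(N, L, K, seed = 1) {
  set.seed(seed)
  soil_vect <- runif(n = N, min = 0, max = 1)
  err_vect <- rnorm(n = N, mean = 0, sd = 1)
  bean_vect <- L * soil_vect / (soil_vect + K) + err_vect
  list(N = N, Cd_disp = soil_vect, Cd_beans = bean_vect)
}

dict_mod_seeded <- create_MM_data_seeded(N = 456, L = 4, K = 0.7)
```

Calling the function twice with the same `seed` should return identical data, which makes timing and convergence comparisons easier to interpret.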

Operating system and R information

My operating system is macOS Sonoma 14.5 (on a MacBook Air M3), and my RStan version is 2.32.6.

The output of writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars"))) is:

Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
  cannot open file '/Users/lasbatsbaptiste/.R/Makevars': No such file or directory
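The error above only says that the file `~/.R/Makevars` does not exist. A small sketch that checks for the file before reading it, so the absence is reported as a message rather than an error (the helper name `read_makevars` is hypothetical):

```r
# Hypothetical helper: read ~/.R/Makevars if present, otherwise report that
# the file is absent and return an empty character vector.
makevars_path <- file.path(Sys.getenv("HOME"), ".R", "Makevars")

read_makevars <- function(path = makevars_path) {
  if (!file.exists(path)) {
    message("No Makevars file at ", path)
    return(character(0))
  }
  readLines(path)
}

makevars_lines <- read_makevars()
```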

The output of devtools::session_info("rstan") is:

─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.1 (2024-06-14)
 os       macOS Sonoma 14.5
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Toronto
 date     2024-10-11
 rstudio  2024.09.0+375 Cranberry Hibiscus (desktop)
 pandoc   3.2 @ /private/var/folders/gg/dbn8ywtd5gg8slf5l5qjtsdh0000gn/T/AppTranslocation/453D0071-7BE9-47E0-99BB-BAC19D25AA55/d/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────
 package        * version    date (UTC) lib source
 abind            1.4-8      2024-09-12 [1] CRAN (R 4.4.1)
 backports        1.5.0      2024-05-23 [1] CRAN (R 4.4.0)
 BH               1.84.0-0   2024-01-10 [1] CRAN (R 4.4.0)
 callr            3.7.6      2024-03-25 [1] CRAN (R 4.4.0)
 checkmate        2.3.2      2024-07-29 [1] CRAN (R 4.4.0)
 cli              3.6.3      2024-06-21 [1] CRAN (R 4.4.0)
 colorspace       2.1-1      2024-07-26 [1] CRAN (R 4.4.0)
 desc             1.4.3      2023-12-10 [1] CRAN (R 4.4.0)
 distributional   0.5.0      2024-09-17 [1] CRAN (R 4.4.1)
 fansi            1.0.6      2023-12-08 [1] CRAN (R 4.4.0)
 farver           2.1.2      2024-05-13 [1] CRAN (R 4.4.0)
 generics         0.1.3      2022-07-05 [1] CRAN (R 4.4.0)
 ggplot2          3.5.1      2024-04-23 [1] CRAN (R 4.4.0)
 glue             1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
 gridExtra        2.3        2017-09-09 [1] CRAN (R 4.4.0)
 gtable           0.3.5      2024-04-22 [1] CRAN (R 4.4.0)
 inline           0.3.19     2021-05-31 [1] CRAN (R 4.4.0)
 isoband          0.2.7      2022-12-20 [1] CRAN (R 4.4.0)
 labeling         0.4.3      2023-08-29 [1] CRAN (R 4.4.0)
 lattice          0.22-6     2024-03-20 [1] CRAN (R 4.4.1)
 lifecycle        1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
 loo              2.8.0      2024-07-03 [1] CRAN (R 4.4.0)
 magrittr         2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
 MASS             7.3-60.2   2024-04-26 [1] CRAN (R 4.4.1)
 Matrix           1.7-0      2024-04-26 [1] CRAN (R 4.4.1)
 matrixStats      1.4.1      2024-09-08 [1] CRAN (R 4.4.1)
 mgcv             1.9-1      2023-12-21 [1] CRAN (R 4.4.1)
 munsell          0.5.1      2024-04-01 [1] CRAN (R 4.4.0)
 nlme             3.1-164    2023-11-27 [1] CRAN (R 4.4.1)
 numDeriv         2016.8-1.1 2019-06-06 [1] CRAN (R 4.4.0)
 pillar           1.9.0      2023-03-22 [1] CRAN (R 4.4.0)
 pkgbuild         1.4.4      2024-03-17 [1] CRAN (R 4.4.0)
 pkgconfig        2.0.3      2019-09-22 [1] CRAN (R 4.4.0)
 posterior        1.6.0      2024-07-03 [1] CRAN (R 4.4.0)
 processx         3.8.4      2024-03-16 [1] CRAN (R 4.4.0)
 ps               1.8.0      2024-09-12 [1] CRAN (R 4.4.1)
 QuickJSR         1.4.0      2024-10-01 [1] CRAN (R 4.4.1)
 R6               2.5.1      2021-08-19 [1] CRAN (R 4.4.0)
 RColorBrewer     1.1-3      2022-04-03 [1] CRAN (R 4.4.0)
 Rcpp             1.0.13     2024-07-17 [1] CRAN (R 4.4.0)
 RcppEigen        0.3.4.0.2  2024-08-24 [1] CRAN (R 4.4.1)
 RcppParallel     5.1.9      2024-08-19 [1] CRAN (R 4.4.1)
 rlang            1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
 rstan          * 2.32.6     2024-03-05 [1] CRAN (R 4.4.0)
 scales           1.3.0      2023-11-28 [1] CRAN (R 4.4.0)
 StanHeaders    * 2.32.10    2024-07-15 [1] CRAN (R 4.4.0)
 tensorA          0.36.2.1   2023-12-13 [1] CRAN (R 4.4.0)
 tibble           3.2.1      2023-03-20 [1] CRAN (R 4.4.0)
 utf8             1.2.4      2023-10-22 [1] CRAN (R 4.4.0)
 vctrs            0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
 viridisLite      0.4.2      2023-05-02 [1] CRAN (R 4.4.0)
 withr            3.0.1      2024-07-31 [1] CRAN (R 4.4.0)

 [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library

This problem is blocking my work, so I am looking forward to any help.

Thanks!

I don’t know the answer, but I did just release this from the spam filter.

I will go in and escape all your code so it’s readable.

I should have said in my first reply that I would strongly recommend trying cmdstanr. It’s not on CRAN, but it’s much easier to install and get running correctly because it runs Stan from CmdStan, which doesn’t need to coordinate with R.

We also anticipate that it will keep up with Stan more closely than RStan due to the difficulties imposed by CRAN.

Otherwise, I would suggest starting by uninstalling Xcode and reinstalling.

The “cannot open file” error indicates that your R isn’t installed in the usual place, or it’s installed with permissions that can’t be accessed through R. I believe you can tell it where to find your R install, but I’m not 100% sure.


Thank you for the quick reply. So, to sum up, you would recommend:

  1. Trying to use Stan through the CmdStanR library rather than rstan.

OR:

  2. Uninstalling and reinstalling Xcode, although I am not sure what it is. To be precise, I already tried uninstalling and reinstalling rstan when I first noticed the problem. It didn’t work, so I also uninstalled and reinstalled R and RStudio, without results either.

  3. Trying to solve the location issue of my R. I easily find R in Applications when I go to the Finder, but I am not sure what the exact path is. I quickly looked for it in RStudio, but I couldn’t find the path to the Applications folder from my home directory (probably a silly problem).

Did I get this right? I’m not very comfortable with these kinds of procedures.

Yes. I’d go with (1) if you can, but it sounds like you’re not comfortable working in the terminal, which may make that hard.

For (2), Xcode is the official Apple C++ compiler. It’s usually managed as an application.

For (3), installs go somewhere in your file system. The error you’re getting is saying that the program is looking in the usual location where people put R, but it’s not finding it there. So it looks like you didn’t install R itself in the usual place.

I’m very sorry it’s so painful. We work very hard to make it as easy as we can, but there’s a ridiculously long tail of people struggling with installs as you see on our list. The problem is that we’re wrestling with CRAN, Mac OS X, and Windows, and C++, none of which make this easy when they auto-update and pull the rug out from under us.

Good morning. I am coming back to the rstan problem we discussed last week. Before switching to CmdStan, I would like to see whether I can get rstan to work the usual way, since it still looks like the easiest option to me.

After restarting the computer, Stan seems a little faster than it was on Friday. That said, problems are still occurring:

  1. R requires more time to compile and fit models that ran very smoothly on a less powerful computer.

  2. Complex models that converged perfectly on my previous computer no longer do so with the same data (Rhat at 2 and all possible error messages coming out; diagnosing with shinystan shows that the chains don’t mix correctly).

  3. I still have the strange problem that when I interrupt a fit, the model first seems to stop, but notifications of chain progression continue to appear until 100% is reached.

If this can help with point 3), there is one error message that appears when the chains reach 100% (after the model is interrupted). It reads:
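To make the convergence complaint concrete, here is a small sketch that lists which parameters exceed an Rhat threshold. The helper name `flag_high_rhat`, the threshold of 1.01, and the example values are all assumptions for illustration; they are not results from this thread.

```r
# Hypothetical helper: return the names of parameters whose Rhat exceeds
# a threshold (1.01 here is an assumed cut-off, adjust as needed).
flag_high_rhat <- function(rhat, threshold = 1.01) {
  names(rhat)[!is.na(rhat) & rhat > threshold]
}

# With an rstan fit, the Rhat column of the summary matrix could be passed in:
#   flag_high_rhat(summary(fit)$summary[, "Rhat"])

# Made-up values, for illustration only.
rhat_example <- c(Cd_lim = 1.00, Cd_aff = 2.03, sigma_Cd = 1.005)
flag_high_rhat(rhat_example)
```

Running the same check on both machines with the same data and seed would show whether the same parameters are affected.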

Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Calls: … doTryCatch -> sendData -> sendData.SOCK0node -> serialize
Execution halted

I tried some of your suggestions to solve the problem.

Regarding Xcode, it turns out that I don’t have it installed on my computer. I see it in the App Store, but it doesn’t appear anywhere in my apps. Do you think installing it might help?

I also tried to correct the problem of R not finding the right folder. I followed a tutorial to set up an R folder on my computer, and entered the following lines in a terminal:

mkdir ~/.R
nano ~/.R/Makevars

And then I filled in the new file with:

CXX14FLAGS=-O3 -march=native -fopenmp
CXX14=g++ -fopenmp

When I check R’s location with:

writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars")))

I no longer get an error message but instead receive :

CXX14FLAGS=-O3 -march=native -fopenmp
CXX14=g++ -fopenmp

But does that mean R and all its dependencies are now present at the right location? On my other computer, I had an R folder that appeared in the left sidebar of the Finder, and I don’t see it now. Furthermore, I noticed that creating the R folder actually slowed the models down. Before I created the folder, the chains of the most complex model I checked took between 125 and 200 seconds to complete; when I tried the fit afterwards, the time increased to between 240 and 340 seconds, and this held on all later attempts.

Apart from the Xcode and folder issues, could the problem be due to missing dependencies, or to R updates? I am on R 4.4.1 on my new computer, and I think I had an older version on the previous one. I guess you would have spotted it quickly if the problem came from there, but could that be it too?

And if we don’t find a solution here, do you know of any other resources or people I could ask to solve the issue?

Thank you for your answer.

It seems like your issue might be on the installation side. However, once you have things sorted, you may also want to consider the post “Even faster matrix math in R on macOS with M1” by Mikhail Popov.

Not quite, but it does indicate that R’s installed in the default location that our scripts expect.

Yes, both of those things can matter.

Stan relies on a C++ toolchain in order to compile models. All the makevars stuff is configuring the C++ compiler. Usually on Mac OS X, people use the Xcode C++ compiler as that’s the one that Apple distributes and maintains.

If you go into a terminal and type the following, what do you see?

$ clang++ --version

When I run it on my Mac, I see this:

~$ clang++ --version
Apple clang version 16.0.0 (clang-1600.0.26.3)
Target: x86_64-apple-darwin23.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

You need to be careful when installing Xcode to make sure to install the command-line tools.

RStan further imposes a dependency between the C++ compiler and R that is not present in cmdstanr. So you have to make sure they’re using compatible versions of everything. The easiest way to do this is usually to just start from scratch and install the latest version of everything.
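The compiler check above can also be run from inside R, which shows what the R session itself finds on its search path. A minimal sketch using base R only (the helper name `compiler_version` is hypothetical):

```r
# Hypothetical helper: return the first line of `<compiler> --version`,
# or NA if the compiler is not found on the PATH seen by this R session.
compiler_version <- function(compiler = "clang++") {
  path <- unname(Sys.which(compiler))
  if (!nzchar(path)) {
    return(NA_character_)
  }
  out <- system2(path, "--version", stdout = TRUE, stderr = TRUE)
  out[1]
}

compiler_version("clang++")
```

If this returns `NA` in RStudio but the terminal command works, the two environments are probably seeing different search paths.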

Will make sure to check the link when Stan works correctly, thanks for the advice.

When I go in a terminal and type:

clang++ --version

I get:

Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

There seem to be two things here: 1) it seems to confirm that I don’t have Xcode, since it doesn’t appear in InstalledDir; 2) my version is older than yours, v15.0 instead of v16.0.

Should I check for the Xcode installation first, with the latest version and see if it solves the problem? Is there anything special to do in order to install the command-line tools as well?

This looks like you do have Xcode installed. That’s what’s implementing clang++ unless you installed your own via homebrew (and even then, I’m pretty sure they report versions differently).

I would update to the latest Clang by installing the latest Xcode. Then after you do that, open a terminal and run this to make sure the command-line tools are installed.

$ xcode-select --install 

You may have to run as super-user,

$ sudo xcode-select --install

depending on how you have things set up.

There are some very verbose instructions here and probably more instructions in our install instructions for Mac OS X.

I installed the Xcode app, then wrote xcode-select --install in a terminal and received a message telling me that command line tools were already installed. I updated them to the latest version:

clang++ --version

Apple clang version 16.0.0 (clang-1600.0.26.3)
Target: arm64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

From what I can see, this improved the situation quite a bit. The fits look faster, even though the models take an unexpected 15-20 s to compile, a delay I didn’t have on my previous computer. Furthermore, I still have the weird issue that when I interrupt a fit, notifications of chain progression continue to pop up until 100% is reached, but this doesn’t prevent me from using R in the meantime.

This still makes me suspicious about the installation; there is still a problem somewhere. But Stan seems to be working anyway, so I guess we could stop there, and I will compare the performance with CmdStanR.

You have already helped me quite a lot, but would you agree to perform a quick test? I am still surprised that a model that worked perfectly on my previous computer no longer converges on my new one. Would you mind running it on your side (the fit is really short, about 3-5 minutes at most) to see whether the lack of convergence comes from my computer or from the model? I don’t have another computer to test it on at the moment. If I sent you the model and a script, would you find a moment to run it and tell me whether it converges for you? I don’t know if this is something that is usually asked on this forum.

Anyway, thank you for the help. If you have any other ideas about the origin of the problems I still encounter (compilation and notifications after interruption), they are welcome, but this is already good.

Compiling a model for the first time always takes about that long. It should only do it once if you’re using most of our interfaces.
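As a sketch of the "compile once" point for rstan: setting `rstan_options(auto_write = TRUE)` asks rstan to save the compiled model to disk so later runs can reuse it, and `stan_model()` plus `sampling()` separate compilation from fitting. The wrapper names `compile_once` and `fit_compiled` are hypothetical; the rstan calls inside them are only executed when the wrappers are called.

```r
# Hypothetical wrappers around rstan; defining them does not require rstan,
# calling them does.
compile_once <- function(path_stan) {
  # Ask rstan to cache the compiled model on disk for reuse.
  rstan::rstan_options(auto_write = TRUE)
  rstan::stan_model(file = path_stan)
}

fit_compiled <- function(model, data, chains = 4L) {
  # Reuse the already compiled model; only sampling happens here.
  rstan::sampling(model, data = data, chains = chains,
                  cores = getOption("mc.cores", 1L))
}

# Usage, assuming the file layout described earlier in the thread:
#   mod <- compile_once("stan_files/240415_MM_raw.stan")
#   fit <- fit_compiled(mod, dict_mod)
```

With this split, refitting the same model on new data should skip the 15-20 s compile step.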

I don’t know what’s going on with the R interface; you can open a new thread on that and ping @bgoodri. It probably has to do with needing to send kill signals to whatever process is running the parallelization. I don’t think that’s a problem with your install.
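The `sendData.SOCK0node` frame in the error message suggests that the chains run in socket-based worker processes from R's parallel package. The sketch below only illustrates that such workers are separate operating-system processes with their own process IDs, which is consistent with them continuing after the main session is interrupted. It is not rstan's actual code, and the helper name `worker_pids` is hypothetical.

```r
library(parallel)

# Hypothetical helper: start a small socket cluster, ask each worker for
# its process ID, then shut the cluster down again.
worker_pids <- function(n_workers = 2L) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)
  unlist(clusterCall(cl, Sys.getpid))
}

pids <- worker_pids(2L)
main_pid <- Sys.getpid()
# The worker IDs differ from the ID of the interactive session.
main_pid %in% pids
```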

Sure: bcarpenter@flatironinstitute.org

I’m happy to run something if it works, but I’m really not our install specialist!