Future rstan is broken with clang++

Is tbbmalloc_proxy harmful anywhere? We may need to have Mac-specific compiler flags to deal with the Catalina thing, but if we could avoid them, that would be simpler.

I’m in the same boat, if we need computers to try stuff on I can bump up as well

I recall that we had minor slow downs on Windows and Ubuntu:

but for MacOS it really makes a difference:

with tbbmalloc:


without:

As there was not a gain on windows+linux we opted against it, but turn it on for macOS by default.

I think we can test without anyone having to inflict Catalina on themselves. What I believe needs to happen is that we need to specify compiler and linker flags such that the system C++ libraries (a.k.a. Xcode’s) are not used at all. According to

https://libcxx.llvm.org/docs/UsingLibcxx.html

something like

clang++ -std=c++11 -stdlib=libc++ -nostdinc++ \
          -I<libcxx-install-prefix>/include/c++/v1 \
          -L<libcxx-install-prefix>/lib \
          -Wl,-rpath,<libcxx-install-prefix>/lib \
          test.cpp

would accomplish that where libcxx-install-prefix is /usr/local/clang[456789]/ under the installer packaged by @coatless . However, I was not able to get the linker to avoid linking to the system’s /usr/lib/libc++.1.dylib in favor of R’s or clang’s even when using the DYLD_LIBRARY_PATH environmental variable. You can check with either of @kevinushey’s scripts on either Mojave or Catalina (although only Catalina seems to cause trouble ATM)

packages <- loadedNamespaces()
result <- lapply(packages, function(package) {
  
  dir <- system.file("libs", package = package)
  libs <- list.files(dir, pattern = "[.]so$", full.names = TRUE)
  if (length(libs) == 0)
    return(FALSE)
  
  output <- system(paste("otool -L", libs, "| grep -s libc++ || true"), intern = TRUE)
  if (length(output) == 0)
    return(FALSE)
  
  writeLines(paste0(package, ":"))
  writeLines(output)
  
})
dlls <- getLoadedDLLs()
paths <- vapply(dlls, `[[`, "path", FUN.VALUE = character(1))
invisible(lapply(paths, function(path) {
  
  if (!file.exists(path))
    return(FALSE)
  
  output <- system(paste("otool -L", shQuote(path), "| grep libc++ || true"), intern = TRUE)
  if (length(output) == 0)
    return(FALSE)
  
  writeLines(paste0(path, ":"))
  writeLines(output)
  
}))
1 Like

Glad to hear you folks are making progress on this!

I’m planning on getting one feature in with 2.19.1 and then tackling this. It will be very helpful to look at what RStan does (especially with regard to compiler flags).

FYI, I just pushed an update which addresses the warning about the virtual destructor thing. That’s actually quite important, as we only call the destructor if the base class which has to percolate that call upwards the inheritance hierarchy in order to ensure that the actual class implementation gets destructed.

I just pushed another update to the for_2.20 branch. This time things work out for the stan_fit_base virtual constructor (which oddly has to be included in the static and dynamic parts).

The changes also include linking against the tbbmalloc_proxy library which speed up the mvn example nicely:

Chain 1: 
Chain 1: Gradient evaluation took 3.8e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.38 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 6.9295 seconds (Warm-up)
Chain 1:                0.729468 seconds (Sampling)
Chain 1:                7.65896 seconds (Total)
Chain 1: 

So we go from 11s to 7.7s => that’s an almost 30% speedup just for that. This is why I am so keen about it.

For releasing rstan I would say that we can handle Catalina workarounds in another point release?

I plan to update my machine to Catalina anyways, but the hacks you propose do not sound like they are easy to implement reliably. One would, for example, require the installation of the R clang which we cannot assume to hold.

BTW, I hope you haven’t switched development to develop in the meantime… I saw some commits there if I recall right.

So I just upgraded to Catalina and I first installed R 3.6.1. Next I installed all packages required to run rstan, including Rcpp. I did so using the binary packages. Then I tried the exception Rcpp example you shared, @bgoodri. It did fail as it showed an “unkown exception”. Then I also tried to get rstan to sample the mvn example - it failed with very strange errors. However, I figured that installing everything from source could help here… and it does!

So I reinstalled a fresh R and installed all packages from source instead of using the binary packages. Now the example with the exception works just fine and the mvn_orange example also just samples fine. I am using the vanilla clang++ of the system.

In short - I don’t think that we need to do anything for Catalina. Users just need to install from source their R packages. That’s not so convenient, but likely something which we don’t have in our hands… if I should try another model, let me know (I just took 8schools which works just fine as well)

2 Likes

Can we provide unofficial (non-cran) binary that users could use?

What happens when you have Rcpp, rstan, etc. installed from source but you load a completely unrelated C++ package like readr before running MCMC?

Whow.

I installed readr as binary and then executed the exception example you shared (loading readr installed as binary and then Rcpp installed from source) => and it failed with “unknown reason”.

Then I swapped back the readr to the source compiled version and the exception thing worked again.

Next, I reinstalled readr as binary, then loaded first Rcpp (installed from source) and after that the readr. Then again the exception thing failed.

So it looks as if loading any C++ based package which has not been compiled from source on Catalina breaks the system.

2 Likes

We have an additional problem in that the for_2.20 branch of rstan is not currently installable with StanHeaders 2.19.x on CRAN. I get this error when doing so about a call to a deleted constructor at line 60 of autodiffstackstorage.hpp .

It would be good if we could work around this on the for_2.20 branch of rstan because StanHeaders 2.21.0 is not compatible with rstan 2.19.x on CRAN for deeper reasons. You will have to pull in the most recent changes.

Just to confirm: You want to install the rstan 2.21 into a R installation with StanHeaders 2.19.x … and you expect that to work?

Why would that be necessary? I mean, wouldn’t we add to rstan 2.21 a pre-requisite of a 2.21 StanHeaders such that the above situation would never arise?

Or did you mean something else?

Well, we either have to upload StanHeaders first or rstan first. So, one of them has to pass all the R CMD check with the other as it is on CRAN. In this case, I think it is easier to start with rstan (that has the 2.21 Stan library copied into its src/ directory), but we have one compiler error from Stan Math 2.19.

You gimme a lot of home-work!

The Catalina compile problem is solved.

I just fixed rstan 2.21 so that it does compile and install whenever StanHeaders 2.19 is around. There was a left-over reference to ChainableStack which is handled in the background by stan-math.

However, with the setup of a rstan 2.21 and StanHeaders 2.19 I cannot compile any more a Stan model, because of:

In file included from filefc7f18865831.cpp:743:
/Users/sebi/R/2019-11-01-transient/rstan/include/rstan_next/stan_fit.hpp:7:10: fatal error: 'stan/model/model_base.hpp' file not found
#include <stan/model/model_base.hpp>
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
17 warnings and 10 errors generated.
make: *** [filefc7f18865831.o] Error 1

which makes sense, since this header was introduced in 2.20 of Stan. Not sure how you (or if you) want to address this. If you don’t mind I will add me as author of rstan as of now? Looks like I have to be blamed for things around this if in doubt.

EDIT: I noticed that the latest for_2.20 stops working due to missing symbols related to stan_fit. Turns out that in commit 7bda1eb106920e52c131173952b2a55aa0e3491c the empty destructor for stan_fit got removed from stan_fit.cpp - but we need that declaration since this results otherwise in a missing symbol.

So now everything is working. We can

  • install rstan 2.21 into a R installation with StanHeaders 2.19 - but then you cannot compile any models as explained above
  • install rstan 2.21 into a R installation with StanHeaders 2.21 … and everything works; even on macOS Catalina
2 Likes

Yeah, I have to copy the Stan 2.21 library and algorithms into the rstan sources in order to compile new models, while hopefully compiling old models. I’ll take a look. Thanks.

Sounds good.

BTW, what do you think about the Catalina story?

We could include into the rstan plugin thing for the PKG_LDFLAGS variable the hard-coding of the ABI library as I suggested in the other thread. This should make things work for rstan with XCode’s clang or with clang7 from CRAN. So I think it would make rstan more robust. My suggestion would be to make this only for Catalina.

It does sound good. I would like to hear from a CRAN / Mac expert though. Do you have a link to the thread on R-SIG-Mac? If possible, we should do it for Catalina and any Mac OS post-Catalina.

Assuming your fix is solid, we have to come up with some Makevars logic to do it in packages that come with Stan programs in addition to the rstan plugin.

My message is awaiting approval for posting as I am not signed up to that list. I am curious what they say…I really think this is an Apple clang bug.

I did that. Upon further review, I can patch the two version.hpp files to work around our _REENTRANT and Eigen plugin problems

which would allow a more straightforward upload of StanHeaders 2.21 to CRAN followed by rstan 2.21 without having to worry about making StanHeaders 2.19 and rstan 2.21 work together.

However, none of the reverse dependencies of StanHeaders are ready for TBB, so they throw stuff like

** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘RBesT’ in dyn.load(file, DLLpath = DLLpath, …):
unable to load shared object ‘/tmp/RBesT.Rcheck/00LOCK-RBesT/00new/RBesT/libs/RBesT.so’:
/tmp/RBesT.Rcheck/00LOCK-RBesT/00new/RBesT/libs/RBesT.so: undefined symbol: _ZN3tbb8internal26task_scheduler_observer_v37observeEb
Error: loading failed
Execution halted
ERROR: loading failed

if there is a TBB on the system path or else a compiler error about not finding any <tbb/*.h> due to lack of LinkingTo: RcppParallel.

You mentioned before that you can ifdef things to make the TBB stuff optional for now. Can you push those onto the StanHeaders_2.21 branch of Stan Math

and, if necessary, the StanHeaders_2.21 branch of Stan Library

If you need those branches in a StanHeaders, you can source

https://raw.githubusercontent.com/stan-dev/rstan/for_2.20/StanHeaders/install-github.R

and call it like

install_StanHeaders(rstan_branch = "for_2.20", 
                    math_branch = "StanHeaders_2.21",
                    library_branch = "StanHeaders_2.21")