Recommended compiler flags make RStan model crash

Reading the thread speedup-by-using-external-blas-lapack-with-cmdstan reminded me that I wanted to make sure we had the right compiler arguments set for our RStan code.
We run a very large number of Stan models, so every little bit counts (currently more than 10k hours per month, by my estimate).

I set the compiler flags in Makeconf following Configuring-C-Toolchain-for-Linux. That is, I set:

CXX14FLAGS=-O3 -march=native -mtune=native -fPIC

The 8 schools model worked, but our own models did not: Stan and R crashed as soon as sampling started, with the error

SAMPLING FOR MODEL '838f06335e6a3b7704453ca29ed6ed1b' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 1.1e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
double free or corruption (out)

After quite a bit of digging, I realised that it is related to AVX instructions. If I set -march and -mtune to ‘westmere’, the last generation without AVX, our models work. If I additionally add -mavx to enable AVX, they crash.
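
A quick way to see which AVX features a given set of flags switches on is to ask g++ for its predefined macros. This is only a minimal sketch, assuming g++ on an x86-64 machine; avx_macros is an illustrative helper name, not something from RStan or the toolchain guide.

```shell
# Print the AVX-related macros that g++ predefines for the given flags.
# avx_macros is a hypothetical helper; it assumes g++ is on PATH.
avx_macros() {
  # -dM -E dumps all predefined macros without compiling any real code.
  g++ "$@" -dM -E -x c++ /dev/null | grep '__AVX' | sort
}
```

Calling avx_macros -march=westmere should print nothing, while avx_macros -march=westmere -mavx should list __AVX__. Running avx_macros -march=native shows what the host CPU enables.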

As shown below, I find this problem on R 4.1 and above but not on R 4.0.5 and below. I have only tested RStan version 2.21.2 (GitRev: 2e1f913d3ca3). However, since I switch Docker images between those tests, other packages may change as well.
All tests are on Ubuntu 20.04.3 LTS (Focal Fossa) on an Intel CPU with “Kaby Lake” architecture.

I will greatly appreciate any help in figuring out how to fix this issue.

Reproduce the error
As I cannot share our internal models, I have reproduced the behaviour in a toy model. These are the steps:
Using Docker, start an R container from the rocker project:
docker run -it rocker/r-ver:4.1.2 bash
Install V8:

apt-get update
apt-get install libv8-dev -y

Put the compiler flags into Makeconf

echo "CXX14FLAGS=-O3 -march=native -mtune=native"  >>  /usr/local/lib/R/etc/Makeconf 
echo "CXX14=g++" >> /usr/local/lib/R/etc/Makeconf 

Open R and install RStan:

install.packages("rstan")

Run the following R code to get the error. The Stan code contains a matrix-vector multiplication, which is essential for reproducing the crash.


library(rstan)

# Toy model: the matrix-vector product M * beta is the important part.
stan_code <- "
data {
  matrix[3,3] M;
  vector[3] y;
}
parameters {
  vector[3] beta;
}
model {
  beta ~ normal(4, 1);
  y ~ normal(M * beta, 1);
}
"

dat <- list(M = matrix(c(5,4,8,3,9,1,4,2,6), nrow = 3),
            y = c(2.5, 4.2, 2.))

fit <- stan(model_code = stan_code, data = dat)

Here are my findings:
The model will crash with:
-march=native -mtune=native and with -march=westmere -mtune=westmere -mavx
but will work fine with
-march=westmere -mtune=westmere

The behaviour is observed in rocker/r-ver:4.1 and rocker/r-ver:4.1.2. There is no issue in rocker/r-ver:4.0.4 and rocker/r-ver:4.0.5.

(PS: I am not reporting this as a bug, as I believe the problem lies outside RStan, but I am stuck figuring out where to look.)


Would it be possible for you to check these compiler options with CmdStanR and the latest CmdStan?



I have no prior experience with CmdStan or CmdStanR but I can give it a go.

Cheers Anders

Great. Because CmdStan has more recent code and a more recent Eigen version, it’s good to check first whether that affects the error.

I can replicate the crash when using the rocker image, but not when using 20.04 locally. I believe this is because the rocker images use pre-built binaries of their R packages, and so the compiler flags used to build the rstan and StanHeaders binaries do not match those of the Stan models, which causes alignment issues.

You can verify this by forcing the packages to install from source before running the stan model:
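
The exact command from this reply is not preserved in the thread. The following is a minimal sketch of one way to do it from the container's shell, assuming that pointing install.packages at a plain CRAN mirror (cloud.r-project.org here) with type = "source" is enough to avoid the pre-built binaries; reinstall_rstan_from_source is a hypothetical helper name.

```shell
# Rebuild StanHeaders and rstan from source so they are compiled with the
# same CXX14FLAGS (from Makeconf) as the Stan models.
# Assumption: a plain CRAN mirror serves source tarballs, not binaries.
reinstall_rstan_from_source() {
  Rscript -e 'install.packages(c("StanHeaders", "rstan"), repos = "https://cloud.r-project.org", type = "source")'
}
```

Run reinstall_rstan_from_source after the flags have been written to Makeconf and before fitting the model. Compiling both packages can take a while.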


@avehtari I tested with CmdStan and CmdStanR and did not find any problems. However, I am not completely sure that the compiler flags I set when installing with install_cmdstan are also the ones used when compiling the model.
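
On that last point: as far as I understand, CmdStan reads extra make variables from the file make/local in its installation directory, both when CmdStan itself is built and when a model is compiled, and the cpp_options passed to install_cmdstan are written to that file. A minimal sketch for inspecting it, assuming you pass the CmdStan directory yourself (for example the path reported by cmdstanr::cmdstan_path()); show_cmdstan_make_local is a hypothetical helper name.

```shell
# Print the make/local file of a CmdStan installation.
# $1: path to the CmdStan directory (an assumption: supplied by the caller).
show_cmdstan_make_local() {
  local_file="$1/make/local"
  if [ -f "$local_file" ]; then
    cat "$local_file"
  else
    echo "no make/local found in $1" >&2
    return 1
  fi
}
```

If the -march and -mtune settings show up in the output, they should apply to model compilation as well.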

@andrjohns Installing rstan and StanHeaders from source indeed seems to work. Thanks a lot!

Cheers, Anders


Good to know it’s not a version issue