Intermittent segfaults with map_rect and TBB

Hi,

I’m not sure whether this is a bug or just a configuration error on my part:
I’m getting intermittent segfaults when using map_rect with TBB
(CmdStan v2.21.0 from the GitHub branch; I also tested the v2.21.0 .tar.gz archive).
The segfaults happen in only a small fraction of model runs, at most about 5%.

To reproduce, compile the attached Stan model (from the Stan manual) and run it with the
attached data (the problem does not seem to depend on the model or the data, though).
In about 2-5% of the runs, I get output like this:

 Elapsed Time: 0.482551 seconds (Warm-up)
               0.6037 seconds (Sampling)
               1.08625 seconds (Total)

[2] 88862 segmentation fault /home/hbr/sync/test/map_rect_model sample data

Sometimes I also get one of these messages: "double free or corruption (fasttop)",
"corrupted double-linked list", or "malloc_consolidate(): invalid chunk size".

To rule out faulty memory, I also tested on a second machine and got the same results.
I also tested v2.20 and saw no segfaults there.
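
Since the crash only shows up in a few percent of runs, it can help to measure the rate with a script instead of rerunning by hand. Below is a minimal Python sketch, not a tested reproduction recipe. The helper name count_crashes is made up for illustration, the command line in the comment assumes the usual CmdStan form (sample data file=...), and setting STAN_NUM_THREADS to 4 is an assumption about how threading is enabled at run time. It relies on POSIX behaviour, where a process killed by a signal reports a negative return code.

```python
import os
import signal
import subprocess
from collections import Counter


def count_crashes(cmd, runs=100, threads=4):
    """Run cmd repeatedly and tally the runs that were killed by a signal.

    Hypothetical helper for illustration. On POSIX, subprocess reports a
    negative return code (-N) when the child was terminated by signal N,
    so a segfault shows up as SIGSEGV and a glibc abort as SIGABRT.
    """
    # Assumption: the thread count is passed via STAN_NUM_THREADS.
    env = dict(os.environ, STAN_NUM_THREADS=str(threads))
    crashes = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode < 0:
            crashes[signal.Signals(-result.returncode).name] += 1
    return crashes


# Example call (assumed paths and arguments, adjust to your setup):
# crashes = count_crashes(
#     ["./map_rect_model", "sample", "data", "file=data.R"], runs=200
# )
# print(dict(crashes), "crashed runs out of 200")
```

Running the same loop against a v2.20 build and a v2.21.0 build would give a like-for-like comparison of the crash rates.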

Contents of my make/local:

CXXFLAGS += -DSTAN_THREADS

Output of g++ -v:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-solus-linux/9/lto-wrapper
Target: x86_64-solus-linux
Configured with: …/configure --prefix=/usr --with-pkgversion=Solus --libdir=/usr/lib64 --libexecdir=/usr/lib64 --with-system-zlib --enable-shared --enable-threads=posix --enable-gnu-indirect-function --enable-__cxa_atexit --enable-plugin --enable-gold --enable-ld=default --enable-clocale=gnu --enable-multilib --with-multilib-list=m32,m64 --enable-lto --with-gcc-major-version-only --with-bugurl=https://dev.getsol.us/ --with-arch_32=i686 --enable-linker-build-id --with-linker-hash-style=gnu --with-gnu-ld --build=x86_64-solus-linux --target=x86_64-solus-linux --enable-languages=c,c++,fortran
Thread model: posix
gcc version 9.2.0 (Solus)

Can someone reproduce this, or does anyone have an idea how to troubleshoot it?
I’m happy to provide more information if required, and to open an issue on GitHub if this is indeed a bug (please tell me which repo).

Thank you very much in advance

map_rect_model.stan (511 Bytes)
data.R (285 Bytes)

Hi!

I am seeing this as well. To confirm: this happens after the model has run successfully, and all the results are written to disk correctly even when the segfault happens, right?

Could you file an issue for this in the Stan Math repository? Thanks.

Sebastian


Yes, this happens after the model runs and the results are written to disk. Just filed the issue: https://github.com/stan-dev/math/issues/1637

Thanks!
