Potential to statically link TBB

This is something @stevebronder and I recently discussed on Slack.

TBB and the nature of dynamic linking tend to cause a variety of problems for people interested in shipping Stan models around, and in general can be a pain on Windows. TBB does not recommend, but does allow, static linking (details can be found in $TBB_PATH/build/big_iron.inc).

I’m wondering if we would consider supporting this in CmdStan, or if there are any hard-and-fast reasons we could not. As I understand it, TBB opposes static linking because having non-dynamic instances of TBB around means that multiple programs running on the same machine aren’t able to effectively split up resources. I believe Stan is already causing this problem, since we vendor our own TBB and set the RPATH of our executables so that each one always picks up the TBB from the CmdStan it was compiled with.

For example, I have two versions of cmdstan currently downloaded. If I inspect the bernoulli example from both, I see:

[brian@FlatTop cmdstan on develop] $ ldd ./examples/bernoulli/bernoulli 
	libtbb.so.2 => /home/brian/Dev/cpp/cmdstan/stan/lib/stan_math/lib/tbb/libtbb.so.2 (0x00007fd4cfb01000)
[brian@FlatTop cmdstan on develop] $ ldd ../cmdstan-ztest/examples/bernoulli/bernoulli 
	libtbb.so.2 => /home/brian/Dev/cpp/cmdstan-ztest/stan/lib/stan_math/lib/tbb/libtbb.so.2 (0x00007fa564c2c000)

Note the -ztest in the second path: these are two different libtbbs! The only advantage our current setup has over full static linking is when multiple models are compiled with the exact same CmdStan installation. For packages like Prophet, which end up repackaging the TBB shared library, there is no advantage over static linking.
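
As a rough sketch of how to run this check for any pair of executables, the Python below pulls the resolved libtbb path out of ldd output and compares the results. The helper names run_ldd and resolved_tbb are made up for illustration, and the regular expression assumes the "name => path (address)" line format shown in the output above.

```python
import re
import subprocess

# Matches ldd lines of the form:
#   libtbb.so.2 => /some/dir/libtbb.so.2 (0x00007fd4cfb01000)
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def run_ldd(executable):
    """Return the text printed by `ldd executable` (Linux only)."""
    result = subprocess.run(["ldd", executable], capture_output=True, text=True)
    return result.stdout


def resolved_tbb(ldd_output):
    """Return the path libtbb.so resolves to, or None if it is not listed.

    A statically linked binary ("not a dynamic executable") yields None.
    """
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1).startswith("libtbb.so"):
            return match.group(2)
    return None


def share_tbb(ldd_output_a, ldd_output_b):
    """True only if both outputs resolve libtbb to the same file path."""
    path_a = resolved_tbb(ldd_output_a)
    path_b = resolved_tbb(ldd_output_b)
    return path_a is not None and path_a == path_b
```

Calling share_tbb(run_ldd(model_a), run_ldd(model_b)) on the two bernoulli binaries above would report that they do not share a TBB, since the resolved paths differ.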

I was able to compile a fully statically linked Stan model using $TBB/build/big_iron.inc and some manual edits. This model samples fine before segfaulting at exit:

[brian@FlatTop bernoulli on develop [?]] $ ldd bernoulli 
	not a dynamic executable
[brian@FlatTop bernoulli on develop [?]] $ file bernoulli 
bernoulli: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=16e9f79d05e568118708ca35c841c8031248e952, for GNU/Linux 3.2.0, with debug_info, not stripped

I think providing this as an option, even if we also recommend against it, will be tremendously helpful for downstream packages like Prophet. I’d love to hear from other developers @wds15 @rok_cesnovar @andrjohns @stevebronder

The Intel devs very strongly recommend against static linking and say many times in their docs that it is not supported. There are (almost hidden) toggles in their makefiles which allow for it, yes… but since it is not supported at all by TBB, I have always opted against making this a possibility in Stan. I don’t think that the Stan project would like to provide such an option given that TBB does not support it. What I would suggest here is to make this feature available in the same mysterious way as the TBB folks did. That is, we should not block static linking, but we should not declare it as something we support. So far I have seen issues with dynamically linking TBB multiple times, but usually it was possible to resolve them.

However, as it stands, the forthcoming rstan 2.26 build by @hsbadr uses a statically linked TBB, if I recall correctly. It seems to work; maybe he can comment more. It’s just that I personally would stay away from things not recommended at all by a vendor…

I think having static linking as an option that can be used by some users, for some use cases, is probably reasonable.

We don’t want to make it the default option, I think.

We certainly don’t want to make it the default, but it would be nice as an option. I was recently helping a Windows user who had a version of TBB (built with MSVC) in %SYSTEM32%. They didn’t know how it got there, but Windows’ dynamic linking priorities were putting that copy above anything from CmdStan. As a result, they couldn’t run any models unless they were willing to delete that file, which was obviously a big ask.
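
To illustrate the "first match wins" behaviour behind that failure, here is a minimal Python sketch: given an ordered list of directories, it returns the first copy of the DLL it finds. The function name first_dll_match and the directory lists are hypothetical. The ordering (system directory ahead of PATH entries) is my assumption about the default Windows search order, and the sketch also assumes CmdStan's TBB is only reachable through PATH.

```python
import os


def first_dll_match(search_dirs, dll_name="tbb.dll"):
    """Return the first existing `dll_name` found in `search_dirs`, in order.

    This only models a "first directory that contains the file wins" rule;
    it does not call into the real Windows loader.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, dll_name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Hypothetical usage: the system directory is listed before the CmdStan
# TBB directory, mirroring the situation described above.
# first_dll_match([r"C:\Windows\System32", r"C:\cmdstan\stan\lib\stan_math\lib\tbb"])
```

Under those assumptions, a stray tbb.dll in the system directory shadows the one shipped with CmdStan no matter what is on PATH, which is exactly the case a statically linked model would sidestep.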

Packages like Prophet end up repackaging a copy of TBB under a name like libtbb-716bd13b.so.2 and setting RPATHs so that their model will only use that copy. I cannot see how this is not equivalent to static linking with extra steps, really.
