Speedup by using external BLAS/LAPACK with CmdStan and CmdStanR/Py

With the GitHub development version, cmdstanr will warn you that you had flags in your make/local and that you should probably copy them from your previous CmdStan install. But that obviously only works if you added them at some point.

I am not sure about this one. I specifically remember a ton of rstan issues that were caused by the march and mtune flags, because the rstan install instructions included them and everyone just copy-pasted that. So I am not sure about making them the default option, but they should be suggested more prominently somewhere (I don’t currently have a great idea where).

Still, the information shouldn’t be hidden if the speed differences are that big. I suggested that install_cmdstan() could ask whether to use march/mtune, with information that using them can double the speed / drop the time by 50%, but if they have any issues they can reinstall and when asked for that option choose “safe mode”.

Can you tell more why Windows has problems with march?

reduce_sum is not a good comparison. That was advertised a lot in a certain release, and brms uses it by default. No one had to do any implementation work for march/mtune; it’s been available for a long time, it keeps getting more important as CPUs gain new instruction sets, and it’s not mentioned anywhere in the CmdStan installation instructions.

I have updated the first post and the post with the timing results to reflect use of -march=native -mtune=native

Specifically for CmdStanR, install_cmdstan() can easily be interactive and ask. And the Installation section of the CmdStan User’s Guide should definitely discuss the possible options.

Yeah, I thought about an interactive version before, but had a worry it would cause issues with CI scripts. Made an issue: Interactive installation · Issue #605 · stan-dev/cmdstanr · GitHub

I only added some basic documentation on STAN_CPP_OPTIMS here: 3 Compiling a Stan Program | CmdStan User’s Guide
Mostly because, come release time, I noticed there were zero docs on it. I am not an expert on these flags, so the documentation is definitely lacking content.

The speedup number there is info I got from Steve on the PR or via e-mail. I did not do extensive research due to a lack of time at release time.

There are many Discourse posts where the solution was removing mtune=native. For example: R session Aborted with changing prior - #3 by dhunfini
There are many more if you search “remove mtune=native” (Search results for 'remove mtune=native' - The Stan Forums - not all are about issues with mtune=native, there are quite a few though).

That issue on Windows might be gone, but no one really knows. There has been positive feedback regarding the ease of installing cmdstanr/py vs rstan, and I don’t want to break that :)

I’ve also heard it said that on arm architectures we should be using mcpu rather than march or mtune. I don’t know a thing about this stuff, just passing along the claim:

Case in point, I never knew about this 🤷

Unfortunately, the use of -march=native -mtune=native is unstable on Windows, since the MinGW GCC has issues with generating AVX and AVX2 instructions. This is why the rstan instructions use flags to explicitly enable SSE without AVX.

Related to -march, -mtune, -mcpu

What I found

  • recent gcc x86 and Arm:
    • -march=X: Tells the compiler that X is the minimal architecture the binary must run on. The compiler is free to use architecture-specific instructions. This option behaves differently on Arm and x86. On Arm, -march does not set -mtune, but on x86 -march=X also implies -mtune=X unless -mtune is given explicitly.
    • -mtune=X: Tells the compiler to optimize for microarchitecture X, but does not allow the compiler to change the ABI or make assumptions about available instructions. This option has more or less the same meaning on Arm and x86.
    • -mcpu=X: On Arm, this option is a combination of -march and -mtune. It simultaneously specifies the target architecture and optimizes for a given microarchitecture. On x86, this option is a deprecated synonym for -mtune.
  • clang: starting from 12.0 -mtune works the same as in gcc
  • So based on the documentation of the recent compilers, -mtune=native should not use features that are not available, but…
  • … older compilers may behave differently or may not recognize the specific CPU details, and then using -mtune=native may fail, while -mtune=generic is likely to work. If only -march=native is defined, it usually implies -mtune=native, except when the compiler doesn’t recognize all the details and switches to -mtune=generic, which seems to have happened in those Windows cases where just dropping -mtune=native did help.
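
The distinction between x86 and Arm summarized above can be sketched as a small helper that picks flags from the host's CPU family. This is only an illustration: the function name `native_flags` is made up for this post, the machine strings are the common values returned by Python's `platform.machine()`, and whether a given compiler actually accepts `-mcpu=native` on Arm still needs to be checked on the machine in question.

```python
import platform


def native_flags(machine=None):
    """Suggest CPU-specific compiler flags (hypothetical helper, not part of CmdStan)."""
    machine = (machine or platform.machine()).lower()
    if machine.startswith(("arm", "aarch64")):
        # On Arm, -mcpu combines the roles of -march and -mtune.
        return ["-mcpu=native"]
    if machine in ("x86_64", "amd64", "i386", "i686"):
        # On x86, -march selects the instruction sets and -mtune the tuning.
        return ["-march=native", "-mtune=native"]
    # Unknown CPU family: suggest nothing and stay with the safe defaults.
    return []


flags = native_flags()
if flags:
    print("CXXFLAGS += " + " ".join(flags))
else:
    print("# no CPU-specific flags suggested for this machine")
```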

So it’s likely that there would be fewer problems with newer compilers, but an interactive installation would allow asking about, e.g., the following options:

Optimization:

  1. Safe: Should work with all compilers and CPUs. [Default. Recommended for MINGW on Windows or if 2. doesn’t work]
  2. Fast: 0-100% faster computation using CPU-specific instruction sets, especially in the case of bigger matrix operations, but the compilation may fail for some compiler-OS-CPU combinations. (CXXFLAGS += -march=native) [Recommended if 3. doesn’t work]
  3. Faster: 0-100% faster computation using CPU-specific instruction sets and CPU-specific optimization, especially in the case of bigger matrix operations, but the compilation may fail for some compiler-OS-CPU combinations. (CXXFLAGS += -march=native -mtune=native) [Recommended for GCC on Linux]

Threads:

  1. Single thread: If you are not using reduce_sum or …? [Default]
  2. Multithread: If you are using reduce_sum or …? (not needed for external BLAS/LAPACK multithreading)

BLAS/LAPACK:

  1. Eigen internal: No need to install other packages. [Default. Recommended for most users.]
  2. External BLAS/LAPACK: Possibly slightly faster single thread computation than with Eigen, and possibility to use multithreaded matrix operations by using external BLAS/LAPACK such as OpenBLAS or Intel MKL (CXXFLAGS += -DEIGEN_USE_BLAS -DEIGEN_USE_LAPACKE) [Recommended only for advanced users in case of slow computation dominated by big matrix operations]
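
To make the proposal concrete, here is a rough sketch of how the three choices could be turned into make/local lines. The function `make_local_lines` and its argument names are invented for illustration and are not part of cmdstanr or CmdStanPy. The CXXFLAGS values are the ones listed in the options above; `STAN_THREADS=true` is the CmdStan make variable that enables threading and is my addition here. The linker settings that an external BLAS/LAPACK also needs are left out.

```python
def make_local_lines(optim="safe", threads=False, external_blas=False):
    """Map the proposed install choices to make/local lines (illustrative only)."""
    lines = []
    if optim == "fast":
        lines.append("CXXFLAGS += -march=native")
    elif optim == "faster":
        lines.append("CXXFLAGS += -march=native -mtune=native")
    elif optim != "safe":
        raise ValueError("optim must be 'safe', 'fast' or 'faster'")
    if threads:
        # Needed for reduce_sum style within-chain parallelism.
        lines.append("STAN_THREADS=true")
    if external_blas:
        # Compile-time switches only; linking against OpenBLAS/MKL is not covered.
        lines.append("CXXFLAGS += -DEIGEN_USE_BLAS -DEIGEN_USE_LAPACKE")
    return lines


# The all-default answers produce an empty make/local.
print(make_local_lines())
print("\n".join(make_local_lines(optim="faster", threads=True)))
```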

EDIT: Minor edit + added this also to the issue Interactive installation · Issue #605 · stan-dev/cmdstanr · GitHub
EDIT2: fixed reduce_sum

It’s reduce_sum! Not sum_reduce.

Great summary!

@Bob_Carpenter asked whether the above-mentioned flags need to be defined when building CmdStan, or whether it is sufficient to give them only when compiling the model. I tested this, and it is sufficient to define them at model compilation time. It is not yet completely clear whether there could be a speedup, e.g., in $cmdstan_summary() if those options were already used when building CmdStan.

The current CmdStanR documentation does not have an example of how to add flags with += in the compile method, and it seems it is not possible at the moment, so I made an issue. This would allow more flexible use of different options and faster testing of their effects.

EDIT: updated += info

I think setting march or mtune on Mac requires rebuilding CmdStan if you use precompiled headers. At least this is what I saw on CI.

It seems to not require recompiling on Linux/Windows - I think in that case the precompiled headers automatically get rebuilt.

Oh dear… I just had to find out that a make/local which looks like this:

CXXFLAGS+=-mtune=native -march=native
CFLAGS+=-mtune=native -march=native

is not enough to turn on these (important) optimizations in all parts of the Stan code. The makefiles were refactored a while ago (I think this one is due to @stevebronder? and maybe I reviewed this change…) such that optimization settings like these are not propagated to the Sundials and TBB libraries. So in order to get these two libraries to use these optimizations as well, one has to provide a make/local such as

CXXFLAGS+=-mtune=native -march=native
CFLAGS+=-mtune=native -march=native
CXXFLAGS_OPTIM_SUNDIALS+=-mtune=native -march=native
CPPFLAGS_OPTIM_SUNDIALS+=-mtune=native -march=native
CXXFLAGS_OPTIM_TBB+=-mtune=native -march=native

I have not tested the speed impact for ODEs, but I am almost sure that the effect is quite big.
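
As a small illustration of the make/local above, the repeated lines can be generated from a single flag string so that the five variables cannot drift apart. The helper name `propagate_flags` is made up for this post; the variable names are exactly the ones listed above and may differ in other CmdStan versions.

```python
# Make variables from the make/local shown above.
VARIABLES = [
    "CXXFLAGS",
    "CFLAGS",
    "CXXFLAGS_OPTIM_SUNDIALS",
    "CPPFLAGS_OPTIM_SUNDIALS",
    "CXXFLAGS_OPTIM_TBB",
]


def propagate_flags(flags):
    """Return one 'VAR+=flags' line per variable (illustrative helper)."""
    return [f"{name}+={flags}" for name in VARIABLES]


print("\n".join(propagate_flags("-mtune=native -march=native")))
```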

This raises the question of whether we can do better here, or at least make sure that our templates do a good job of pointing out these key options. Maybe we should also make it easier to turn this on for Stan programs and their libraries? Suggestions (tagging @rok_cesnovar)?

Addon:

A quick workaround for users to ensure that the optimizations are used throughout is to use this make/local:

CC=clang -mtune=native -march=native
CXX=clang++ -mtune=native -march=native

…not ideal…but seems to work…

I just noticed that using this option doubles the model compilation time (at least for a simple non-centered 8 schools model). On my Linux laptop with gcc, the model compilation time went from 9s to 19s. The additional -mtune=native didn’t have an effect on the compilation time.

Yeah, I think we need to sit down and look these over. I personally use O=3 -march=native -mtune=native in my make/local file. Is the behavior we want that anything added to CXXFLAGS also gets added to something like a CXXFLAGS_SUNDIALS? Or do we want CXXFLAGS_SUNDIALS to be independent of CXXFLAGS? IMO I’d rather have them separate, and the user can always put CXXFLAGS_SUNDIALS=$(CXXFLAGS) in their make/local.

Yes, it will for sure increase compile time a lot, especially for newer machines that can execute multiple sizes of SIMD operations. For computers that support wide instruction sets like AVX512, the compiler will often write code that checks whether there are X bytes left to work on in a loop and then dispatches to a vectorized instruction set. For example, many iterations of the loop can use AVX512 instructions (8 doubles at a time), but near the end of a loop the compiler may only be able to use AVX2 instructions (4 doubles at a time). So writing all that out and figuring out when it’s worth using these loop variants takes a lot more time for the compiler. I believe having native assembly support also affects a bunch of the other compiler optimizations.
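
The main-loop-plus-remainder structure described above can be illustrated with a toy calculation. This is not what any compiler literally does; `plan_loop` is a made-up helper that only counts how many iterations would run at each width if a loop over n doubles were split greedily into 8-wide, 4-wide, and scalar parts.

```python
def plan_loop(n, widths=(8, 4, 1)):
    """Greedily split n elements into (width, iterations) pairs, widest first."""
    plan = []
    remaining = n
    for width in widths:
        count, remaining = divmod(remaining, width)
        plan.append((width, count))
    return plan


# 23 doubles: two 8-wide steps, one 4-wide step, three scalar steps.
print(plan_loop(23))  # [(8, 2), (4, 1), (1, 3)]
```

Each extra width is another code path the compiler has to generate and cost out, which is one plausible reason for the longer compile times.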

Sorry, @wds15, but I don’t understand where I’m supposed to put STAN_CPP_OPTIMS=true or where I might indicate the CXXFLAGS. I know how to do this for CmdStan, just not through cmdstanr.

Oops, the answer’s right in the top post. Is this mirrored in the cmdstanr doc somewhere? I couldn’t find it looking there and an internet search took me here.

Not going well for me. I tried to guess whether it’d want O=3 or -O3 and guessed wrong and now it won’t build anything for me. Is there a way to reset and try again?

I’d say I’m pretty hooked into Stan, but I can’t find the optimizations.

For now, I just tried using = instead of += to see if I could get going again and I see that many of the stages of compilation are not using the -O3 or arch or tune directives, or both. Sundials gets the -O3, but not tbb. Neither get the tune or arch directives. As far as I can tell, the only things getting the directive are stansummary, print, diagnose, and main. I don’t see the -march or -mtune anywhere else.
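
Checking by eye which build steps picked up the flags is tedious, so here is a minimal sketch of doing it mechanically on a saved build log. The function `missing_flags` is hypothetical, it assumes each compile command sits on one line, starts with the compiler name, and names its output with `-o`, and the two sample lines are shortened stand-ins for real log lines.

```python
COMPILERS = ("clang++", "clang", "g++", "gcc")


def missing_flags(log_text, required=("-march=native", "-mtune=native")):
    """Map each compiled output file to the required flags its command lacks."""
    report = {}
    for line in log_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] not in COMPILERS:
            continue  # not a compile command (rm, cp, chmod, ...)
        if "-o" not in tokens[:-1]:
            continue  # no explicit output file on this line
        target = tokens[tokens.index("-o") + 1]
        absent = [flag for flag in required if flag not in tokens]
        if absent:
            report[target] = absent
    return report


sample = "\n".join([
    "clang++ -march=native -mtune=native -O3 -c -o bin/cmdstan/print.o src/cmdstan/print.cpp",
    "clang++ -pipe -O3 -c -x c src/cvodes/cvodes.c -o src/cvodes/cvodes.o",
    "chmod +x bin/stanc",
])
for target, absent in missing_flags(sample).items():
    print(target, "is missing", " ".join(absent))
```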

This would be less frustrating if there was doc somewhere I could trust (if there is, please let me know where, because I spent quite a bit of time looking and can’t find it).

Here’s the whole dump of what happened when I tried just setting the flags (no idea why it keeps showing those +=, but I think this stuff all persists on my machine somehow). Is there a file I can edit to just get rid of the old junk?

> cpp_options = list("CXXFLAGS = -march=native -mtune=native -O3")
> cmdstanr::cmdstan_make_local(cpp_options = cpp_options, append = TRUE)
[1] "CXXFLAGS += -march=native -mtune=native O=3"
[2] "CXXFLAGS += -march=native -mtune=native -O3"
[3] "CXXFLAGS += -march=native -mtune=native -O3"
[4] "CXXFLAGS = -march=native -mtune=native -O3" 
> cmdstanr::rebuild_cmdstan(cores = 4)
rm -f -r test
rm -f 
rm -f 
rm -f 
rm -f 
  removing dependency files
rm -f    
rm -f   
rm -f   
  cleaning sundials targets
rm -f stan/lib/stan_math/lib/sundials_5.7.0/src/nvector/serial/nvector_serial.o
  cleaning Intel TBB targets
rm -f -rf stan/lib/stan_math/lib/tbb
rm -f bin/stanc bin/stansummary bin/print bin/diagnose
rm -f -r src/cmdstan/main*.o bin/cmdstan
rm -f 
rm -f examples/bernoulli/bernoulli examples/bernoulli/bernoulli.o examples/bernoulli/bernoulli.d examples/bernoulli/bernoulli.hpp
rm -f -r 
cp bin/mac-stanc bin/stanc
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -fvisibility=hidden -o bin/cmdstan/stansummary.o src/cmdstan/stansummary.cpp
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -fvisibility=hidden -o bin/cmdstan/print.o src/cmdstan/print.cpp
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -fvisibility=hidden -o bin/cmdstan/diagnose.o src/cmdstan/diagnose.cpp
chmod +x bin/stanc
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/nvector/serial/nvector_serial.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/nvector/serial/nvector_serial.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_math.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_math.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodea.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodea.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodea_io.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodea_io.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_bandpre.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_bandpre.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_bbdpre.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_bbdpre.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_diag.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_diag.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_direct.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_direct.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_io.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_io.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_ls.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_ls.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls_sim.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls_sim.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls_stg.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls_stg.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls_stg1.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_nls_stg1.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_spils.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/cvodes/cvodes_spils.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_band.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_band.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_dense.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_dense.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_direct.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_direct.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_futils.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_futils.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_iterative.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_iterative.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_linearsolver.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_linearsolver.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_matrix.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_matrix.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_memory.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_memory.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_nonlinearsolver.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_nonlinearsolver.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_nvector.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_nvector.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_nvector_senswrapper.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_nvector_senswrapper.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_version.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_version.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sunmatrix/band/sunmatrix_band.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sunmatrix/band/sunmatrix_band.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sunmatrix/dense/sunmatrix_dense.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sunmatrix/dense/sunmatrix_dense.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sunlinsol/band/sunlinsol_band.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sunlinsol/band/sunlinsol_band.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sunlinsol/dense/sunlinsol_dense.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sunlinsol/dense/sunlinsol_dense.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sunnonlinsol/newton/sunnonlinsol_newton.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sunnonlinsol/newton/sunnonlinsol_newton.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/sunnonlinsol/fixedpoint/sunnonlinsol_fixedpoint.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/sunnonlinsol/fixedpoint/sunnonlinsol_fixedpoint.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idaa.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idaa.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idaa_io.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idaa_io.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_bbdpre.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_bbdpre.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_direct.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_direct.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_ic.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_ic.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_io.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_io.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_ls.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_ls.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_nls.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_nls.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_nls_sim.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_nls_sim.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_nls_stg.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_nls_stg.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_spils.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/idas/idas_spils.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_bbdpre.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_bbdpre.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_direct.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_direct.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_io.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_io.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_ls.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_ls.o
clang++ -pipe   -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT  -O3 -I stan/lib/stan_math/lib/sundials_5.7.0/include -DNO_FPRINTF_OUTPUT     -O3  -c -x c -include stan/lib/stan_math/lib/sundials_5.7.0/include/stan_sundials_printf_override.hpp stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_spils.c -o stan/lib/stan_math/lib/sundials_5.7.0/src/kinsol/kinsol_spils.o
touch stan/lib/stan_math/lib/tbb/tbb-make-check

--- Compiling the main object file. This might take up to a minute. ---
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -o src/cmdstan/main.o src/cmdstan/main.cpp

--- Compiling pre-compiled header. This might take a few seconds. ---
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c stan/src/stan/model/model_header.hpp -o stan/src/stan/model/model_header.hpp.gch
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS               -Wl,-L,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"      bin/cmdstan/print.o        -Wl,-L,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"   -o bin/print
clang++ -march=native -mtune=native -O3 -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS               -Wl,-L,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"      bin/cmdstan/diagnose.o        -Wl,-L,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/bcarpenter/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"   -o bin/diagnose
ar -rs stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_5.7.0/src/nvector/serial/nvector_serial.o stan/lib/stan_math/lib/sundials_5.7.0/src/sundials/sundials_math.o

...

clang++ -c -MMD -O2 -DUSE_PTHREAD -DDO_ITT_NOTIFY -stdlib=libc++ -m64 -mrtm -mmacosx-version-min=10.11   -Wno-unknown-warning-option -Wno-deprecated-copy   -DTBB_SUPPRESS_DEPRECATED_MESSAGES=1 -fno-rtti -fno-exceptions -D__TBBMALLOC_BUILD=1 -Wno-non-virtual-dtor -Wno-dangling-else -fPIC  -I../tbb_2020.3/src -I../tbb_2020.3/src/rml/include -I../tbb_2020.3/include -I../tbb_2020.3/src/tbbmalloc -I../tbb_2020.3/src/tbbmalloc ../tbb_2020.3/src/tbbmalloc/large_objects.cpp


--- CmdStan v2.27.0 built ---
>