GNU C++ compiler and Intel MKL (FYI)

performance

#1

Intel MKL is free to download and use. I installed it into my home folder.
Intel provides linking recommendations for use with GNU C++.

To make it work, I added the following to the file make/local of CmdStan:

MKLROOT = /home/andre/intel/mkl
CXXFLAGS += -DEIGEN_USE_MKL_ALL  -m64 -I${MKLROOT}/include
CXX14FLAGS  += -DEIGEN_USE_MKL_ALL  -m64 -I${MKLROOT}/include
LDLIBS += -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_lp64.a ${MKLROOT}/lib/intel64/libmkl_sequential.a ${MKLROOT}/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm -ldl

I am posting this in case somebody would like to do the same. It worked on my system.


#2

Nice! Dirk Eddelbuettel also has a good blog post showing how to install the Intel MKL via apt:

http://dirk.eddelbuettel.com/blog/2018/04/15/


#3

If one installs MKL globally, then I’d probably go for dynamic linking. One has to choose slightly different options. The good thing is that Intel developed a link line advisor, which does most of the work.
If I am not totally wrong, there is also a TBB version of MKL. I noticed a speedup of around 15% on my old Sandy Bridge system.
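
For comparison with the static setup in #1, a dynamically linked make/local could look roughly like the sketch below. It assumes that MKLROOT points at the MKL installation and that the shared libraries sit in ${MKLROOT}/lib/intel64; both are assumptions that may not hold for a distribution package, so the result should be checked against the link line advisor. The rpath option is an optional addition so that the loader finds the libraries at run time without setting LD_LIBRARY_PATH.

```make
# Sketch only: sequential (single-threaded) MKL, LP64 interface, dynamic linking.
# MKLROOT and the lib/intel64 subdirectory are assumptions about the install layout.
MKLROOT = /home/andre/intel/mkl
CXXFLAGS += -DEIGEN_USE_MKL_ALL -m64 -I${MKLROOT}/include
CXX14FLAGS += -DEIGEN_USE_MKL_ALL -m64 -I${MKLROOT}/include
# --no-as-needed keeps the linker from dropping the MKL libraries;
# -rpath records the library directory in the binary for the run-time loader.
LDLIBS += -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -Wl,-rpath,${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl
```

Compared with the static version, the --start-group/--end-group wrapper around the .a files is no longer needed, because the shared libraries resolve their mutual dependencies at load time.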


#4

Do you have any benchmarks for whether this notably improves Stan performance?

And is this possible to use with rstan?


#5

That depends on how vectorized your models are, the CPU architecture, and whether you choose the single-threaded MKL or the MKL/TBB version. On the other hand, GPU support is coming to Stan soon, and it mostly targets the same operations that MKL/TBB covers.
If you have a matrix multiplication and a Poisson likelihood, then the benefits are good. If you replace that with a Bessel function or some other for-loop based likelihood that MKL does not cover, your model spends most of its time there, and the benefits are relatively small.
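
A rough way to put numbers on this is Amdahl's law. The symbols and figures below are illustrative assumptions, not measurements: p is the fraction of run time spent in operations that MKL accelerates, and s is the speedup MKL gives on that fraction.

```latex
% Overall speedup when a fraction p of the run time is accelerated by a factor s
S(p, s) = \frac{1}{(1 - p) + \dfrac{p}{s}}

% Hypothetical model dominated by matrix products: p = 0.9, s = 2
S(0.9, 2) = \frac{1}{0.1 + 0.45} \approx 1.82

% Hypothetical model dominated by an uncovered for-loop likelihood: p = 0.2, s = 2
S(0.2, 2) = \frac{1}{0.8 + 0.1} \approx 1.11
```

With the same MKL speedup on the covered part, the second model gains only about 11% overall, which is why the benefit shrinks once the likelihood itself is the bottleneck.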
I don’t have CPU capacity left to run intensive tests.