Is there a way to have Stan use Apple's BLAS/LAPACK implementations from the Accelerate framework on Apple silicon machines?
Two Eigen pages seem useful: "Using BLAS/LAPACK from Eigen", which discusses using Apple Accelerate as the backend, and the documentation for Eigen's AccelerateSupport module.
I tried using MacPorts to install lapacke per the instructions linked above, and then updated make/local through cmdstanr by modifying what I found in this previous Stan post:
cmdstan_make_local(cpp_options=list(
STAN_THREADS=TRUE, STAN_NO_RANGE_CHECKS=TRUE,
LDLIBS = "-lblas -llapack -llapacke",
CXXFLAGS = "-mcpu=native -DEIGEN_USE_BLAS -DEIGEN_USE_LAPACKE -framework Accelerate /opt/local/lib/lapack/liblapacke.dylib",
CXXFLAGS_OPTIM="-mcpu=native",
CXXFLAGS_OPTIM_TBB="-mcpu=native",
CXXFLAGS_OPTIM_SUNDIALS="-mcpu=native"
),
append=FALSE)
rebuild_cmdstan(cores = 8)
But the build failed with the error:
ld: library not found for -llapacke
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The error appears in the following context:
clang++ -E -x c++ ../tbb_2020.3/src/tbbmalloc/mac64-tbbmalloc-export.def -O2 -DUSE_PTHREAD -stdlib=libc++ -arch arm64 -mmacosx-version-min=10.11 -Wall -Wno-unknown-warning-option -Wno-deprecated-copy -mcpu=native -DTBB_SUPPRESS_DEPRECATED_MESSAGES=1 -fno-rtti -fno-exceptions -D__TBBMALLOC_BUILD=1 -Wno-non-virtual-dtor -Wno-dangling-else -I../tbb_2020.3/src -I../tbb_2020.3/src/rml/include -I../tbb_2020.3/include > tbbmalloc.def
clang++ -mcpu=native -DEIGEN_USE_BLAS -DEIGEN_USE_LAPACKE -framework Accelerate /opt/local/lib/lapack/liblapacke.dylib -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes -DSTAN_THREADS -I stan/lib/stan_math/lib/tbb_2020.3/include -mcpu=native -DSTAN_NO_RANGE_CHECKS -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.78.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials -DBOOST_DISABLE_ASSERTS -DSTAN_NO_RANGE_CHECKS -Wl,-L,"/Users/ssp3nc3r/.cmdstan/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/ssp3nc3r/.cmdstan/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb" bin/cmdstan/stansummary.o -lblas -llapack -llapacke -Wl,-L,"/Users/ssp3nc3r/.cmdstan/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/ssp3nc3r/.cmdstan/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb" -o bin/stansummary
clang++ -c -MMD -O2 -DUSE_PTHREAD -stdlib=libc++ -arch arm64 -mmacosx-version-min=10.11 -Wall -Wno-unknown-warning-option -Wno-deprecated-copy -mcpu=native -DTBB_SUPPRESS_DEPRECATED_MESSAGES=1 -Wno-non-virtual-dtor -Wno-dangling-else -fPIC -D__TBBMALLOC_BUILD=1 -I../tbb_2020.3/src -I../tbb_2020.3/src/rml/include -I../tbb_2020.3/include -I../tbb_2020.3/src/tbbmalloc -I../tbb_2020.3/src/tbbmalloc ../tbb_2020.3/src/tbbmalloc/proxy.cpp
ld: library not found for -llapacke
clang: error: linker command failed with exit code 1 (use -v to see invocation)
clang++ -c -MMD -O2 -DUSE_PTHREAD -stdlib=libc++ -arch arm64 -mmacosx-version-min=10.11 -Wall -Wno-unknown-warning-option -Wno-deprecated-copy -mcpu=native -DTBB_SUPPRESS_DEPRECATED_MESSAGES=1 -Wno-non-virtual-dtor -Wno-dangling-else -fPIC -D__TBBMALLOC_BUILD=1 -I../tbb_2020.3/src -I../tbb_2020.3/src/rml/include -I../tbb_2020.3/include -I../tbb_2020.3/src/tbbmalloc -I../tbb_2020.3/src/tbbmalloc ../tbb_2020.3/src/tbbmalloc/tbb_function_replacement.cpp
make: *** [bin/stansummary] Error 1
make: *** Waiting for unfinished jobs....
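One possible reading of the log above: the link step for bin/stansummary passes -llapacke without any -L option for the MacPorts directory, so the linker may simply not be searching /opt/local/lib/lapack. A minimal sketch for checking that, assuming the library really is at the path used in CXXFLAGS (lapacke_link_flags is a hypothetical helper written for this post, not part of CmdStan or MacPorts):

```shell
# Print linker flags for LAPACKE if the library exists in the given directory.
# lapacke_link_flags is a hypothetical helper, not a CmdStan or MacPorts command.
lapacke_link_flags() {
  dir="$1"
  if [ -f "$dir/liblapacke.dylib" ]; then
    # -L adds the directory to the linker search path so -llapacke can resolve.
    printf '%s\n' "-L$dir -llapacke"
  else
    return 1
  fi
}

# Directory taken from the CXXFLAGS above; adjust if MacPorts installed elsewhere.
lapacke_link_flags /opt/local/lib/lapack || echo "liblapacke.dylib not found in that directory"
```

If this prints flags, they could go into LDLIBS in the cmdstan_make_local() call in place of the bare -llapacke. Whether the resulting build then actually routes Eigen's calls through Accelerate is a separate question that I have not confirmed.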
PS: Using Apple Accelerate in R provides substantial speedups; you can activate it like so:
cd /Library/Frameworks/R.framework/Resources/lib/
ln -s -i -v libRblas.vecLib.dylib libRblas.dylib