I have a model in which the computational bottleneck (AFAICT) is a large matrix multiplication operation, and I’m trying to see if I can get additional speedups from utilising the GPU.
I have CmdStanR 0.1.3 and CmdStan 2.25 installed on macOS Mojave 10.14.6.
Running clinfo -l gives:
Platform #0: Apple
+-- Device #0: Intel(R) Core(TM) i7-7920HQ CPU @ 3.10GHz
+-- Device #1: Intel(R) HD Graphics 630
`-- Device #2: AMD Radeon Pro 560 Compute Engine
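For anyone mapping this listing to the IDs used further down, here is a rough sketch of how the platform and device numbers can be read off programmatically. Note that sample_clinfo and pick_device are hypothetical helper names made up for this example (they are not part of clinfo or CmdStan), and the listing is pasted in so the snippet runs without clinfo installed; on a real machine you would pipe clinfo -l into the awk script instead.

```shell
#!/bin/sh
# Hypothetical helper: prints the `clinfo -l` listing copied from this post.
# On a real machine, replace its body with: clinfo -l
sample_clinfo() {
cat <<'EOF'
Platform #0: Apple
+-- Device #0: Intel(R) Core(TM) i7-7920HQ CPU @ 3.10GHz
+-- Device #1: Intel(R) HD Graphics 630
`-- Device #2: AMD Radeon Pro 560 Compute Engine
EOF
}

# Hypothetical helper: finds the first device whose line contains the
# given substring and prints the matching platform and device indices.
pick_device() {
  sample_clinfo | awk -v pat="$1" '
    # Remember the index of the most recent platform header.
    /^Platform #/ { p = $2; gsub(/[^0-9]/, "", p) }
    # On the first device line containing the substring, extract its index.
    /Device #/ && index($0, pat) {
      d = $0
      sub(/.*Device #/, "", d)
      sub(/:.*/, "", d)
      print "OPENCL_PLATFORM_ID=" p
      print "OPENCL_DEVICE_ID=" d
      exit
    }'
}

pick_device "AMD Radeon"
```

With the listing above this selects platform 0 and device 2, i.e. the discrete AMD GPU rather than the CPU or the integrated Intel graphics.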
So, following some previous discussions on the forums, I edited .cmdstanr/cmdstan-2.25.0/make/local (which was empty) to contain:
STAN_OPENCL=true
OPENCL_DEVICE_ID=2
OPENCL_PLATFORM_ID=0
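In case it helps to reproduce the setup, the same edit can be done from a terminal. This is only a sketch: CMDSTAN_DIR is a variable name I am assuming for the example (it would point at something like ~/.cmdstanr/cmdstan-2.25.0), and it falls back to a scratch directory so the snippet can be tried without touching a real install. It overwrites make/local, which is fine here only because the file was empty.

```shell
#!/bin/sh
# CMDSTAN_DIR is an assumed variable for this sketch; set it to your
# CmdStan directory. It defaults to a throwaway directory.
CMDSTAN_DIR="${CMDSTAN_DIR:-$(mktemp -d)}"
mkdir -p "$CMDSTAN_DIR/make"

# Overwrite make/local with the three OpenCL settings from this post.
cat > "$CMDSTAN_DIR/make/local" <<'EOF'
STAN_OPENCL=true
OPENCL_DEVICE_ID=2
OPENCL_PLATFORM_ID=0
EOF

# Show the resulting file so the settings can be double-checked.
cat "$CMDSTAN_DIR/make/local"
```

The -DOPENCL_DEVICE_ID=2 and -DOPENCL_PLATFORM_ID=0 flags in the compiler command below suggest the settings were picked up by the build.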
Then I ran rebuild_cmdstan() and got:
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes -I stan/lib/stan_math/lib/opencl_2.2.0 -I stan/lib/stan_math/lib/tbb_2019_U8/include -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.7 -I stan/lib/stan_math/lib/boost_1.72.0 -I stan/lib/stan_math/lib/sundials_5.2.0/include -DBOOST_DISABLE_ASSERTS -DSTAN_OPENCL -DOPENCL_DEVICE_ID=2 -DOPENCL_PLATFORM_ID=0 -DCL_HPP_TARGET_OPENCL_VERSION=120 -DCL_HPP_MINIMUM_OPENCL_VERSION=120 -DCL_HPP_ENABLE_EXCEPTIONS -Wno-ignored-attributes -Wl,-L,"/Users/adamhaber/.cmdstanr/cmdstan-2.25.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/adamhaber/.cmdstanr/cmdstan-2.25.0/stan/lib/stan_math/lib/tbb" bin/cmdstan/diagnose.o stan/lib/stan_math/lib/boost_1.72.0/stage/lib/libboost_program_options.a stan/lib/stan_math/lib/boost_1.72.0/stage/lib/libboost_program_options.a -framework OpenCL -o bin/diagnose
Undefined symbols for architecture x86_64:
"tbb::internal::task_scheduler_observer_v3::observe(bool)", referenced from:
stan::math::ad_tape_observer::ad_tape_observer() in diagnose.o
tbb::interface6::task_scheduler_observer::~task_scheduler_observer() in diagnose.o
tbb::interface6::task_scheduler_observer::~task_scheduler_observer() in diagnose.o
tbb::interface6::task_scheduler_observer::~task_scheduler_observer() in diagnose.o
tbb::internal::task_scheduler_observer_v3::~task_scheduler_observer_v3() in diagnose.o
tbb::internal::task_scheduler_observer_v3::~task_scheduler_observer_v3() in diagnose.o
tbb::internal::task_scheduler_observer_v3::~task_scheduler_observer_v3() in diagnose.o
...
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [bin/diagnose] Error 1
make: *** Waiting for unfinished jobs....
Any help would be much appreciated!