CmdStan GPU support

I installed cmdstan-2.20.0 from the tarball and modified make/local to enable OpenCL. However, make build gives me the following errors. Please advise.

g++ -std=c++1y -pthread -Wno-sign-compare -I stan/lib/stan_math/lib/opencl_1.2.8 -O3 -I src -I stan/src -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.3 -I stan/lib/stan_math/lib/boost_1.69.0 -I stan/lib/stan_math/lib/sundials_4.1.0/include -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -DSTAN_OPENCL -DOPENCL_DEVICE_ID= -DOPENCL_PLATFORM_ID= -DCL_USE_DEPRECATED_OPENCL_1_2_APIS -D__CL_ENABLE_EXCEPTIONS -Wno-ignored-attributes -c -o bin/cmdstan/stansummary.o src/cmdstan/stansummary.cpp
In file included from stan/lib/stan_math/stan/math/prim/mat/fun/mdivide_left_tri.hpp:10:0,
from stan/lib/stan_math/stan/math/prim/mat/fun/mdivide_left_tri_low.hpp:6,
from stan/lib/stan_math/stan/math/prim/mat/fun/chol2inv.hpp:7,
from stan/lib/stan_math/stan/math/prim/mat.hpp:68,
from stan/src/stan/mcmc/chains.hpp:5,
from src/cmdstan/stansummary.cpp:5:
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp: In constructor ‘stan::math::opencl_context_base::opencl_context_base()’:
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp:104:30: error: expected primary-expression before ‘>=’ token
if (OPENCL_PLATFORM_ID >= platforms_.size()) {
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp:108:48: error: expected primary-expression before ‘]’ token
platform_ = platforms_[OPENCL_PLATFORM_ID];
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp:115:28: error: expected primary-expression before ‘>=’ token
if (OPENCL_DEVICE_ID >= devices_.size()) {
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp:119:42: error: expected primary-expression before ‘]’ token
device_ = devices_[OPENCL_DEVICE_ID];
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp: In member function ‘std::__cxx11::string stan::math::opencl_context::description() const’:
stan/lib/stan_math/stan/math/opencl/opencl_context.hpp:227:48: error: expected primary-expression before ‘<<’ token
msg << "Platform ID: " << OPENCL_DEVICE_ID << "\n";

What do you have for opencl in make/local?


clinfo -l
Platform #0: NVIDIA CUDA
`-- Device #0: GeForce GTX 1050 Ti
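As far as I know, the numbers after `#` in this listing are the indices that OPENCL_PLATFORM_ID and OPENCL_DEVICE_ID select. With several platforms or devices installed, a small helper can flatten the listing into one line per device. This is only a sketch, assuming the `clinfo -l` layout shown above; the function name clinfo_ids is hypothetical.

```shell
# Hypothetical helper: reads `clinfo -l` output on stdin and prints
# "<platform index> <device index> <device name>" for every device.
clinfo_ids() {
  awk '
    /Platform #[0-9]+:/ {
      p = $0
      sub(/.*Platform #/, "", p)   # drop everything up to the index
      sub(/:.*/, "", p)            # drop the platform name
    }
    /Device #[0-9]+:/ {
      d = $0
      sub(/.*Device #/, "", d)     # "0: GeForce GTX 1050 Ti"
      name = d
      sub(/:.*/, "", d)            # device index only
      sub(/^[0-9]+: */, "", name)  # device name only
      print p, d, name
    }
  '
}
```

Usage would be `clinfo -l | clinfo_ids`; the first two columns are the candidate values for the platform and device IDs.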


The curly brackets are not needed there.

An example of a valid make/local would be:
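A minimal sketch for the single platform and device reported by clinfo above, both at index 0. The three variable names are assumed from the -DSTAN_OPENCL, -DOPENCL_PLATFORM_ID and -DOPENCL_DEVICE_ID flags visible in the compile line; the helper name opencl_make_local is hypothetical.

```shell
# Hypothetical helper: prints the OpenCL lines for make/local.
# $1 = platform index, $2 = device index (bare integers, no braces).
opencl_make_local() {
  printf 'STAN_OPENCL=true\n'
  printf 'OPENCL_PLATFORM_ID=%s\n' "$1"
  printf 'OPENCL_DEVICE_ID=%s\n' "$2"
}
```

From the CmdStan root directory this could be used as `opencl_make_local 0 0 >> make/local`. If either value ends up empty, the compiler receives `-DOPENCL_DEVICE_ID=` or `-DOPENCL_PLATFORM_ID=`, the macro expands to nothing, and g++ reports "expected primary-expression", as in the errors above.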


It works. Thank you. Make gives a warning that clock skew was detected but I guess this is because of make/local.

When I tried to compile my model, it gave a bunch of errors such as the following (although OpenCL is included):
g++ -std=c++1y -pthread -Wno-sign-compare -I stan/lib/stan_math/lib/opencl_1.2.8 -O3 -I src -I stan/src -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.3 -I stan/lib/stan_math/lib/boost_1.69.0 -I stan/lib/stan_math/lib/sundials_4.1.0/include -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -DSTAN_OPENCL -DOPENCL_DEVICE_ID=0 -DOPENCL_PLATFORM_ID=0 -DCL_USE_DEPRECATED_OPENCL_1_2_APIS -D__CL_ENABLE_EXCEPTIONS -Wno-ignored-attributes -lOpenCL src/cmdstan/main.o stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_idas.a /home/eval/lmockus/cmdstan-2.20.0.gpu/cliff/cliff.o -o /home/eval/lmockus/cmdstan-2.20.0.gpu/cliff/cliff
src/cmdstan/main.o: In function `cl::detail::getPlatformVersion(_cl_platform_id*)':
main.cpp:(.text+0x38): undefined reference to `clGetPlatformInfo'
main.cpp:(.text+0x63): undefined reference to `clGetPlatformInfo'

That means the linker can't find the OpenCL library to link against. Are you using Windows or Linux? On Linux it should work if you installed the driver normally. On Windows you need to set a flag.

Windows flag: LDFLAGS_OPENCL= -L"$(CUDA_PATH)\lib\x64" -lOpenCL
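On Linux, -lOpenCL is resolved against a file named libOpenCL.so (or libOpenCL.a) in the linker search path; a versioned libOpenCL.so.1 on its own is not enough. A minimal sketch of a check for a single directory, assuming a POSIX shell (check_opencl_lib is a hypothetical helper):

```shell
# Hypothetical helper: reports whether -lOpenCL could resolve in directory $1.
# The linker looks for libOpenCL.so or libOpenCL.a, not libOpenCL.so.1.
check_opencl_lib() {
  dir="$1"
  if [ -e "$dir/libOpenCL.so" ] || [ -e "$dir/libOpenCL.a" ]; then
    echo "found: -lOpenCL can resolve in $dir"
  else
    echo "missing: no libOpenCL.so or libOpenCL.a in $dir"
    return 1
  fi
}
```

For example, `check_opencl_lib /usr/lib/x86_64-linux-gnu`. A "found" result only shows the library is present; it does not rule out other causes of undefined references.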


I just checked: the OpenCL library is in /usr/lib/x86_64-linux-gnu
I installed it as in

Yeah, that all seems fine.

I keep forgetting that we had a bug in 2.20, that we fixed a day or two after the release, but there was no hotfix release. The next release is coming in 9 days.

For the time being I would recommend cloning the latest develop (git clone --single-branch --recursive). That one does require git unfortunately.

Can you share anything about the model you are trying to speed up? Thanks.

git is fine. Do you recommend git?

The model is in cliff.stan (3.3 KB)
It is a time series model with a neural network instead of AR(1). It runs very slowly but uses matrix multiplication, so I thought a GPU might speed it up. I am also thinking about adding threading in order to use all available cores.

Yes, I would recommend cloning with git.

There are quite a few matrix multiplications in here, so you should see some speedup, depending on the sizes. From a quick inspection of the model, I think at the moment you would benefit more from threading, unless the matrix multiplications are 200x200 times 200x200 or larger.

If you are interested in threading I recommend this tutorial:

Actually, the matrices are 200x10. Refactoring into map_rect form is a bit more complicated, and the model is big enough already. The design matrix (X.) is different for each year, and the calculations have to be done year by year. Perhaps each shard should contain the data for one year? I am just thinking out loud. That means the shards would have unequal numbers of data points. Hopefully doable… The problem I am encountering is “trace ran beyond…”, which kills the sampler. I posted about it a few days ago, so once it is resolved I will proceed with multithreading.