GPU functions in rstan

wigglyhypersurface · March 17, 2020, 4:51pm

Hi folks,

Now that rstan is on Stan 2.19.1, some of the new GPU-enabled functions should be available in a properly configured rstan, yes? And that should make them available to interfaces like brms, if rstan is set up correctly? The documentation for the Stan GPU routines currently doesn’t say how to set things up for rstan specifically. Where is the rstan equivalent of the place where I need to put:

STAN_OPENCL=true
OPENCL_DEVICE_ID=${CHOSEN_INDEX}
OPENCL_PLATFORM_ID=${CHOSEN_INDEX}

Operating System: Windows 10
brms Version: 2.11.1

bgoodri · March 17, 2020, 7:33pm

I think that is the correct syntax for your ~/.R/Makevars.win, but the GPU only comes on if you are taking the Cholesky factorization of a matrix that is 1200 x 1200 or bigger. And I don’t think (m)any brms models are even doing Cholesky factorizations.

paul.buerkner · March 18, 2020, 10:58am

Gaussian process models (with the gp() function) will make use of it. Otherwise, Cholesky factorization within the sampling process is not done explicitly for brms models (but may be happening in Stan itself behind the scenes).

Max_Mantei · March 18, 2020, 1:42pm

Do you mean there are no speed-ups before 1200x1200?! Or is that some kind of hard coded threshold ncol(X) > 1200 ? GPU, CPU; (weird pseudo code, haha).

mcol · March 18, 2020, 2:40pm

In cholesky_decompose.hpp there’s a check that happens if GPUs are available:

if (m.rows() >= opencl_context.tuning_opts().cholesky_size_worth_transfer)

in which case the operation is done on the GPU. That cholesky_size_worth_transfer is defined as 1250 (I don’t know if it can be controlled from some OpenCL configuration file) as a compromise between the cost of transferring data to/from the GPU and the speed-up gained by using the GPU. For smaller matrices, the costs exceed the gains, so computations are left on the CPU. I couldn’t find the PR in which that value was chosen, but I’m pretty sure there was some testing behind it.

Max_Mantei · March 18, 2020, 2:57pm

Good to know, thanks! :)

stevebronder · March 18, 2020, 6:31pm

To get the GPU stuff working with rstan/brms, I add the below to my Makevars. I still need to figure out a solution for pulling out the OpenCL headers.


https://gist.github.com/SteveBronder/318844e588b7f9243bfaa5d595ee7980

stan_gpu_flags

USER_OPTIM_FLAGS= -pipe -fPIC -O3 -mtune=native -march=native
# I couldn't tell whether the opencl headers we use existed in `StanHeaders` on cran
#  So I do a 
# git clone --recursive https://github.com/stan-dev/rstan
# and then include the OpenCL headers
USER_OPENCL_FLAGS= -I"/path_to_rstan/rstan/StanHeaders/inst/include/mathlib/lib/opencl_2.1.0"
# You can get these with clinfo -l
USER_OPENCL_FLAGS+= -DSTAN_OPENCL=1 -DOPENCL_DEVICE_ID=1 -DOPENCL_PLATFORM_ID=2 -lOpenCL
# Some extra bits we need
USER_OPENCL_FLAGS+= -DCL_HPP_TARGET_OPENCL_VERSION=120 -DCL_HPP_MINIMUM_OPENCL_VERSION=120 -DCL_HPP_ENABLE_EXCEPTIONS -Wno-ignored-attributes

(Excerpt truncated here; the full file is in the gist linked above.)

The list at the link below needs updating, but at the bottom of the docs we have a list of the functions that can use the GPU backend.

http://mc-stan.org/math/opencl_support.html

The big bummer with GPUs in general is that the cost of transferring data to and from the GPU is very high. For Cholesky, every iteration we need to pass the values and adjoints of the matrix, which holds Stan’s autodiff variable type var. Note, though, that for the GLM methods we have some tricks in the compiler so it goes fast for much smaller problems. We set that threshold when we first started writing the GPU code, but a lot of performance improvements have happened since then, so we should probably go back and check whether it’s lower now. Though I wouldn’t expect it to be more than 1000 or so.
