Stan is not working on GPU in Linux

jmaronas · March 16, 2021, 2:48pm

Hi,

I am trying to compile Stan with GPU support. I am not sure whether the problem is on my side, so I want to ask before opening an issue. I am testing this on two different setups and getting different errors.

Ubuntu 20

On one side, I am on a Linux machine with Ubuntu 20. I have installed the drivers (latest version, 460) and nvidia-cuda-toolkit through apt-get, so the system is using CUDA 10.1. CmdStan is installed through cmdstanpy. Once installed, I go directly into the directory where CmdStan is installed (i.e. no Python call) and execute: ./runTests.py test/unit -f opencl following all the instructions outlined here: Stan Math Library: OpenCL CPU/GPU Support. The tests took about 2 hours to compile and run, and it seems all of them passed.

However, the problem appears when I launch a simple Stan program through the cmdstanpy interface. More precisely, I execute:

from cmdstanpy import CmdStanModel

# Compile the program with OpenCL enabled on platform 0, device 0
cpp_options = {
    # 'STAN_THREADS': True,
    # 'STAN_CPP_OPTIMS': True,
    'STAN_OPENCL': True,
    'OPENCL_PLATFORM_ID': 0,
    'OPENCL_DEVICE_ID': 0,
}

sm = CmdStanModel(stan_file='./stan_files/BNN.stan', cpp_options=cpp_options)

And get the following error:

ERROR:cmdstanpy:file /home/jmaronasm/stan/stan_files/BNN.stan, exception ERROR
In file included from stan/lib/stan_math/stan/math/opencl/prim.hpp:86,
                 from stan/lib/stan_math/stan/math/prim.hpp:7,
                 from stan/src/stan/io/dump.hpp:7,
                 from src/cmdstan/command.hpp:24,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/scalar_type.hpp:14:8: error: partial specialization of ‘struct stan::scalar_type<T, typename std::enable_if<stan::math::conjunction<stan::is_kernel_expression_and_not_scalar<T, void> >::value, void>::type>’ after instantiation of ‘struct stan::scalar_type<stan::math::constant_<int>, void>’ [-fpermissive]
   14 | struct scalar_type<T, require_all_kernel_expressions_and_none_scalar_t<T>> {
      |        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make: *** [make/program:14: src/cmdstan/main_opencl.o] Error 1 
ERROR:cmdstanpy:model compilation failed

Ubuntu 16

On another machine I have Ubuntu 16 installed. Here, the nvidia-cuda-toolkit provided through apt-get is quite old (CUDA 7.5). For that reason I installed CUDA 11.0 directly, with the latest NVIDIA drivers (460), and added the CUDA path to PATH and LD_LIBRARY_PATH. In this case the error I get when compiling is different and much longer than the one on Ubuntu 20:

ERROR:cmdstanpy:file /home/jmaronasm/Escritorio/phd/TRABAJANDO/CURRENT_PROJECTS/VIDRIOS_MODELO_JERÁRQUICO/FULLBayesian_HGM_Stan/stan_files/BNN.stan, exception ERROR
In file included from stan/lib/stan_math/stan/math/opencl/kernel_generator.hpp:133:0,
                 from stan/lib/stan_math/stan/math/opencl/rev/vari.hpp:7,
                 from stan/lib/stan_math/stan/math/rev/core/var.hpp:5,
                 from stan/lib/stan_math/stan/math/rev/core/profiling.hpp:6,
                 from src/cmdstan/write_profiling.hpp:4,
                 from src/cmdstan/command.hpp:17,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/kernel_generator/multi_result_kernel.hpp: In instantiation of ‘stan::math::results_cl<T_results>::operator=(const stan::math::expressions_cl<T_expressions ...>&)::<lambda(auto:18 ...)> [with auto:18 = {std::integral_constant<long unsigned int, 0ul>}; T_expressions = {const stan::math::addition_operator_<stan::math::load_<stan::math::matrix_cl<double, void>&>, stan::math::elt_multiply_<stan::math::load_<const stan::math::matrix_cl<double, void>&>, stan::math::trigamm  (the error output continues here)

In both cases it seems the errors are not related to missing libraries at link time or anything like that. Any thoughts before I open an issue?

Thank you!

maxbiostat · March 16, 2021, 3:41pm

@mitzimorris

rok_cesnovar · March 16, 2021, 4:29pm

I am taking a look. There is one additional similar report on Discourse right now.

jmaronas · March 16, 2021, 4:57pm

Thank you, please keep me in the loop if possible. Can you point me to the thread where this is being discussed?

rok_cesnovar · March 16, 2021, 5:02pm

No other discussion is going on right now, but let's continue in "Partial specialization error when compiling model with opencl enabled".

It seems that something is wrong with the 2.26.1 release with respect to this. I can replicate it locally. Not sure what happened. It does work with 2.26.0. I will dig deeper and report back.

rok_cesnovar · March 16, 2021, 5:36pm

It seems that g++ 9.3.0 does not like something about our OpenCL backend in 2.26.1. It's fixed on develop, but that obviously won't help you right now.

Run:

cmdstan_make_local(cpp_options = "CXXFLAGS += -fpermissive")
rebuild_cmdstan(cores = 4)

and then try again.

The other solution is to switch to using clang++ for now:

cmdstan_make_local(cpp_options = "CXX = clang++")
rebuild_cmdstan(cores = 4)

Sorry for the inconvenience.

jmaronas · March 16, 2021, 6:10pm

As I am working in cmdstanpy and it seems there is no function as in R to make these changes, my solution has been to run:

install_cmdstan --overwrite --version 2.26.0

and then point cmdstanpy to that installation through set_cmdstan_path (see Installation — CmdStanPy 0.9.64 documentation), and now it works.

Posting it here for python users.
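For Python users, the set_cmdstan_path step could look roughly like the sketch below. The helper names find_cmdstan and use_cmdstan are hypothetical, and the search roots (~/.cmdstan and ~/.cmdstanpy) are an assumption about where install_cmdstan places its installations; this depends on the cmdstanpy version, so adjust them to your setup.

```python
from pathlib import Path


def find_cmdstan(version, roots=None):
    """Return the first existing <root>/cmdstan-<version> directory, or None.

    The default roots are an assumption about where install_cmdstan puts
    its installations; pass your own list if yours differ.
    """
    if roots is None:
        roots = [Path.home() / ".cmdstan", Path.home() / ".cmdstanpy"]
    for root in roots:
        candidate = Path(root) / f"cmdstan-{version}"
        if candidate.is_dir():
            return candidate
    return None


def use_cmdstan(version):
    """Point cmdstanpy at the given CmdStan version and return the active path."""
    import cmdstanpy  # imported here so find_cmdstan works without cmdstanpy

    path = find_cmdstan(version)
    if path is None:
        raise FileNotFoundError(f"no cmdstan-{version} installation found")
    cmdstanpy.set_cmdstan_path(str(path))
    return cmdstanpy.cmdstan_path()
```

Calling use_cmdstan("2.26.0") before constructing CmdStanModel should then compile against that installation.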

rok_cesnovar · March 16, 2021, 6:12pm

Oh, sorry, I completely forgot that you are using cmdstanpy. In that case I would advise going to the cmdstan installation folder and writing

CXXFLAGS += -fpermissive

to the make/local file. It most likely doesn't exist, so just create it.
I would advise against using 2.26.0.
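For cmdstanpy users who prefer to script this, here is a minimal sketch of the make/local edit. The helper names add_make_local_flag and patch_installed_cmdstan are hypothetical; the only cmdstanpy call used is cmdstan_path(), which returns the folder of the active CmdStan installation.

```python
from pathlib import Path


def add_make_local_flag(cmdstan_dir, line="CXXFLAGS += -fpermissive"):
    """Append `line` to <cmdstan_dir>/make/local unless it is already there.

    Creates make/local if it does not exist.
    Returns True if the file was changed, False otherwise.
    """
    local = Path(cmdstan_dir) / "make" / "local"
    local.parent.mkdir(parents=True, exist_ok=True)
    existing = local.read_text().splitlines() if local.exists() else []
    if line in (entry.strip() for entry in existing):
        return False
    existing.append(line)
    local.write_text("\n".join(existing) + "\n")
    return True


def patch_installed_cmdstan():
    """Apply the flag to the CmdStan installation that cmdstanpy is using."""
    import cmdstanpy  # assumed to be installed

    return add_make_local_flag(cmdstanpy.cmdstan_path())
```

After calling patch_installed_cmdstan(), compile the model again with the same cpp_options. The same helper could write the clang++ alternative instead by passing line="CXX = clang++".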

jmaronas · March 17, 2021, 9:01am

That works, thank you.
