Cmdstanpy: Not working with google colab GPU

kyo219 · October 13, 2021, 6:39pm

I want to run CmdStan on the Google Colab GPU.

It works on the CPU, but I cannot get it to work on the GPU.


import cmdstanpy

# Compile-time flags; the OpenCL IDs select the platform and device to use
cpp_options = {
    # 'STAN_THREADS': True,
    # 'STAN_CPP_OPTIMS': True,
    'STAN_OPENCL': True,
    'OPENCL_PLATFORM_ID': 0,
    'OPENCL_DEVICE_ID': 0,
}

sm = cmdstanpy.CmdStanModel(stan_file="/content/gdrive/MyDrive/colab/stan_code.stan", cpp_options=cpp_options)

fit = sm.sample(data=df, chains=2, iter_warmup=500, iter_sampling=500, show_progress=True)

---- output ----

Chain 1 - done: 0%| | 1/1000 [07:35<126:19:31, 455.23s/it]
Chain 2 - warmup: 0%| | 0/1 [07:36<?, ?it/s]

↑ It freezes here.

WardBrian · October 13, 2021, 7:11pm

Hi @kyo219 - we have had reports in the past of the show_progress option acting strangely in notebooks. It has all been rewritten in 1.0, which should be out soon (or you can pip install from GitHub now).

That said, if it’s working on the CPU I suspect it is another issue.

@mitzimorris - I know you’ve used cmdstanpy in Google Colab. Is the issue that cmdstan itself would need to be rebuilt with these flags, or could it be an issue with external dependencies? I'm not sure what OpenCL requires.

rok_cesnovar · October 13, 2021, 7:22pm

I don't have any familiarity with Google Colab, so I'm not sure I can help a ton, but I can try. The OpenCL dependencies that are required are listed here: 14 Parallelization | CmdStan User’s Guide

kyo219 · October 14, 2021, 10:47am

Thank you, guys.

When I try

sudo apt install nvidia-cuda-toolkit

I get this error message:

Preparing to unpack .../48-nvidia-cuda-dev_9.1.85-3ubuntu1_amd64.deb ...
dpkg: error processing archive /tmp/apt-dpkg-install-1gv7Ba/48-nvidia-cuda-dev_9.1.85-3ubuntu1_amd64.deb (--unpack): trying to overwrite '/usr/include/cublas.h', which is also in package libcublas-dev 10.2.1.243-1
dpkg-deb: error: paste subprocess was killed by signal(Broken pipe)

kyo219 · October 14, 2021, 11:05am

My environment:

/content# clinfo -l
Platform #0: NVIDIA CUDA
 `-- Device #0: Tesla P100-PCIE-16GB

!nvidia-smi

Thu Oct 14 10:39:31 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   32C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


!nvcc --version


nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
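
The platform and device numbers in the `clinfo -l` listing above are the values that go into OPENCL_PLATFORM_ID and OPENCL_DEVICE_ID. As a minimal sketch, assuming the listing always has the `Platform #N:` / `Device #N:` shape shown in this post, the pairs could be extracted like this (`parse_clinfo_list` and `opencl_cpp_options` are hypothetical helper names, not part of cmdstanpy):

```python
import re


def parse_clinfo_list(text):
    """Return (platform_id, device_id, device_name) tuples from `clinfo -l` output."""
    devices = []
    platform_id = None
    for line in text.splitlines():
        platform = re.match(r"\s*Platform #(\d+):", line)
        if platform:
            # Remember which platform the following device lines belong to.
            platform_id = int(platform.group(1))
            continue
        device = re.search(r"Device #(\d+):\s*(.+)", line)
        if device and platform_id is not None:
            devices.append(
                (platform_id, int(device.group(1)), device.group(2).strip())
            )
    return devices


def opencl_cpp_options(platform_id, device_id):
    """Build the cpp_options dict used earlier in this thread."""
    return {
        "STAN_OPENCL": True,
        "OPENCL_PLATFORM_ID": platform_id,
        "OPENCL_DEVICE_ID": device_id,
    }


# The listing posted above, copied verbatim.
sample_output = """Platform #0: NVIDIA CUDA
 `-- Device #0: Tesla P100-PCIE-16GB"""

devices = parse_clinfo_list(sample_output)
print(devices)  # [(0, 0, 'Tesla P100-PCIE-16GB')]
print(opencl_cpp_options(*devices[0][:2]))
```

With a single platform and a single device, as here, both IDs are 0, which matches the cpp_options used in the first post.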

kyo219 · October 14, 2021, 3:12pm

I also tried cmdstanr in Google Colab.

I got the same problem as with cmdstanpy.

install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
library(cmdstanr)
install_cmdstan(cores=4, overwrite = TRUE)

sm <- cmdstan_model("/stan_code.stan",
                        cpp_options = list(stan_opencl = TRUE))

Compilation finishes.

But there is a problem with the sampling.

fit <- sm$sample(data = data, chains = 4, parallel_chains = 4,
                        opencl_ids = c(0, 0),  refresh = 0)

Running MCMC with 4 parallel chains...

Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:

Chain 1 Exception: stan_code_model_namespace::log_prob: o_Tau is not symmetric. o_Tau[1,2] = inf, but o_Tau[2,1] = inf (in '/tmp/RtmpEBdlH8/model-4d182d8a1c.stan', line 43, column 8 to column 40)

Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,

Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

.....

↑ It freezes here.

kyo219 · October 17, 2021, 9:05am

When I interrupt the sampling (after it freezes),

I see this error message:

Chain 3 -   done:   0%|          | 0/1 [01:58<?, ?it/s]Chain 3 processing error, non-zero return code 1
 error message:
	Can't open specified file, "/tmp/tmpht4im148/jwfqze60.json"

Chain 4 -   done:   0%|          | 0/1 [01:58<?, ?it/s]Chain 4 processing error, non-zero return code 1
 error message:
	Can't open specified file, "/tmp/tmpht4im148/jwfqze60.json"
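
The path in this message is a temporary JSON file. One assumption worth testing is that it is the data file cmdstanpy writes when it is given in-memory data, and that it is cleaned up when the run is interrupted. A minimal sketch for ruling that out is to write the data to a JSON file you control and pass the path to `sample` instead. The helper name `write_data_json` and the toy data are hypothetical, and `tempfile.mkdtemp` is only a stand-in for any writable directory:

```python
import json
import os
import tempfile


def write_data_json(data, directory, name="stan_data.json"):
    """Write a dict of Stan data to <directory>/<name> and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        json.dump(data, f)
    return path


# Hypothetical toy data; replace with the real model inputs.
toy_data = {"n": 3, "y": [0, 1, 1]}

# mkdtemp creates a directory that is not removed automatically.
data_dir = tempfile.mkdtemp(prefix="stan_data_")
data_path = write_data_json(toy_data, data_dir)

# Confirm the file exists and round-trips before starting a long run.
with open(data_path) as f:
    assert json.load(f) == toy_data
print(data_path)

# Then pass the path rather than the in-memory object, for example:
# fit = sm.sample(data=data_path, chains=2, iter_warmup=500, iter_sampling=500)
```

If the freeze persists with an explicit data file, the missing-file message is more likely a side effect of the interruption than its cause.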

kyo219 · October 19, 2021, 5:16am

Compile error:

INFO:cmdstanpy:compiling stan program, exe file: /content/drive/MyDrive/eiv_gpu/m4_non
INFO:cmdstanpy:compiler options: stanc_options=None, cpp_options={'STAN_THREADS': True, 'STAN_OPENCL': 'TRUE', 'OPENCL_PLATFORM_ID': 0, 'OPENCL_DEVICE_ID': 0}
ERROR:cmdstanpy:file /content/drive/MyDrive/eiv_gpu/m4_non.stan, exception ERROR
 In file included from stan/lib/stan_math/stan/math/opencl/prim.hpp:200:0,
                 from stan/lib/stan_math/stan/math/prim.hpp:7,
                 from stan/src/stan/io/dump.hpp:7,
                 from src/cmdstan/command.hpp:30,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp: In function ‘stan::return_type_t<T_y_cl, T_loc_cl, T_scale_cl> stan::math::normal_lcdf(const T_y_cl&, const T_loc_cl&, const T_scale_cl&)’:
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:215:39: error: the value of ‘stan::math::internal::opencl_normal_lcdf_impl’ is not usable in a constant expression
   auto lcdf_n = opencl_code<internal::opencl_normal_lcdf_impl>(
                                       ^~~~~~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:16:12: note: ‘stan::math::internal::opencl_normal_lcdf_impl’ was not declared ‘constexpr’
 const char opencl_normal_lcdf_impl[] = STRINGIFY(
            ^~~~~~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:221:31: error: the value of ‘stan::math::internal::opencl_normal_lcdf_ldncdf_impl’ is not usable in a constant expression
       = opencl_code<internal::opencl_normal_lcdf_ldncdf_impl>(
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:59:12: note: ‘stan::math::internal::opencl_normal_lcdf_ldncdf_impl’ was not declared ‘constexpr’
 const char opencl_normal_lcdf_ldncdf_impl[] = STRINGIFY(
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from stan/lib/stan_math/stan/math/opencl/prim.hpp:249:0,
                 from stan/lib/stan_math/stan/math/prim.hpp:7,
                 from stan/src/stan/io/dump.hpp:7,
                 from src/cmdstan/command.hpp:30,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp: In function ‘stan::return_type_t<T> stan::math::std_normal_lcdf(const T_y_cl&)’:
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:201:29: error: the value of ‘stan::math::internal::opencl_std_normal_lcdf_impl’ is not usable in a constant expression
       opencl_code<internal::opencl_std_normal_lcdf_impl>(
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:16:12: note: ‘stan::math::internal::opencl_std_normal_lcdf_impl’ was not declared ‘constexpr’
 const char opencl_std_normal_lcdf_impl[] = STRINGIFY(
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:205:39: error: the value of ‘stan::math::internal::opencl_std_normal_lcdf_dnlcdf’ is not usable in a constant expression
   auto dnlcdf = opencl_code<internal::opencl_std_normal_lcdf_dnlcdf>(
                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:52:12: note: ‘stan::math::internal::opencl_std_normal_lcdf_dnlcdf’ was not declared ‘constexpr’
 const char opencl_std_normal_lcdf_dnlcdf[] = STRINGIFY(
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from stan/lib/stan_math/stan/math/opencl/kernel_generator.hpp:136:0,
                 from stan/lib/stan_math/stan/math/opencl/rev/vari.hpp:7,
                 from stan/lib/stan_math/stan/math/rev/core/var.hpp:5,
                 from stan/lib/stan_math/stan/math/rev/core/profiling.hpp:6,
                 from src/cmdstan/write_profiling.hpp:4,
                 from src/cmdstan/command.hpp:21,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/kernel_generator/multi_result_kernel.hpp: In instantiation of ‘stan::math::results_cl<T_results>::operator=(const stan::math::expressions_cl<T_expressions ...>&)::<lambda(auto:20 ...)> [with auto:20 = {std::integral_constant<long unsigned int, 0ul>}; T_expressions = {const stan::math::constant_<double>&}; <template-parameter-2-2> = void; T_results = {stan::math::matrix_cl<double>&}]’:



.....




make: *** [src/cmdstan/main_threads_opencl.o] Error 1 
ERROR:cmdstanpy:model compilation failed

rok_cesnovar · October 19, 2021, 7:16am

Can you try and compile this GLM example model:

data {
  int<lower=1> k;
  int<lower=0> n;
  matrix[n, k] X;
  int y[n];
}
parameters {
  vector[k] beta;
  real alpha;
}
model {
  target += std_normal_lpdf(beta);
  target += std_normal_lpdf(alpha);
  target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
}
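
To try this model, it needs data with the fields `k`, `n`, `X`, and `y` declared above. The following is a rough sketch that simulates such data in plain Python and then compiles with the same cpp_options used earlier in the thread. The function names, sizes, seed, and the file name `glm.stan` are all assumptions for illustration, and the cmdstanpy part is wrapped in a function so nothing is compiled until it is called:

```python
import math
import random


def make_glm_data(n=200, k=5, seed=1):
    """Simulate data matching the data block of the GLM example model."""
    rng = random.Random(seed)
    beta = [rng.gauss(0, 1) for _ in range(k)]
    alpha = rng.gauss(0, 1)
    X = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    y = []
    for row in X:
        # Bernoulli outcome with a logit link, as in bernoulli_logit_glm.
        eta = alpha + sum(b * x for b, x in zip(beta, row))
        p = 1.0 / (1.0 + math.exp(-eta))
        y.append(1 if rng.random() < p else 0)
    return {"k": k, "n": n, "X": X, "y": y}


def compile_and_sample(stan_file, data, use_opencl=True):
    """Compile the model (optionally with OpenCL) and draw samples."""
    import cmdstanpy  # imported here so the data helper works without it

    cpp_options = None
    if use_opencl:
        cpp_options = {
            "STAN_OPENCL": True,
            "OPENCL_PLATFORM_ID": 0,
            "OPENCL_DEVICE_ID": 0,
        }
    model = cmdstanpy.CmdStanModel(stan_file=stan_file, cpp_options=cpp_options)
    return model.sample(data=data, chains=2, iter_warmup=500, iter_sampling=500)


data = make_glm_data()
print(data["n"], data["k"], len(data["X"]), len(data["X"][0]))

# Assuming the model above is saved as "glm.stan" (hypothetical file name):
# fit = compile_and_sample("glm.stan", data)
```

Running it once with `use_opencl=False` and once with `use_opencl=True` separates a problem in the OpenCL build from a problem in the model itself.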
