CmdStanPy: not working with Google Colab GPU

I want to run CmdStan on the Google Colab GPU.

The model runs on the CPU, but I cannot confirm that it works on the GPU.

import cmdstanpy

# Build options passed to the C++ compiler; the commented-out flags are not in use.
cpp_options = {
    # 'STAN_THREADS': True,
    # 'STAN_CPP_OPTIMS': True,
    'STAN_OPENCL': True,
    'OPENCL_PLATFORM_ID': 0,
    'OPENCL_DEVICE_ID': 0,
}

sm = cmdstanpy.CmdStanModel(stan_file="/content/gdrive/MyDrive/colab/stan_code.stan", cpp_options=cpp_options)
fit = sm.sample(data=df, chains=2, iter_warmup=500, iter_sampling=500, show_progress=True)

---- output ----

Chain 1 - done: 0%| | 1/1000 [07:35<126:19:31, 455.23s/it]
Chain 2 - warmup: 0%| | 0/1 [07:36<?, ?it/s]

↑ It freezes here.

Hi @kyo219 - we have had reports in the past of the show_progress function acting strangely in notebooks. It has all been rewritten for 1.0, which should be out soon (or you can do a pip install from GitHub if you want it now).

That said, if it’s working on the CPU I suspect it is another issue.

@mitzimorris - I know you've used cmdstanpy in Google Colab. Is the issue that CmdStan itself would need to be rebuilt with these flags, or could it be an issue with external dependencies? I'm not sure what OpenCL requires.

I don't have any familiarity with Google Colab, so I'm not sure how much I can help, but I can try. The required OpenCL dependencies are listed here: 14 Parallelization | CmdStan User's Guide
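On the rebuild question: the CmdStan guide describes enabling OpenCL by setting flags in the make/local file of the CmdStan installation and then rebuilding. Below is a minimal sketch of rendering that file's contents from the same options used in the first post. The helper name `make_local_text` is hypothetical, and it assumes the makefiles only test whether a flag is set, so disabled flags are left out. Whether a rebuild is actually what Colab needs is not confirmed in this thread.

```python
def make_local_text(cpp_options):
    """Render a dict of C++ build options as lines for CmdStan's make/local."""
    lines = []
    for key, value in cpp_options.items():
        if isinstance(value, bool):
            # Assumption: flags are checked for being defined, so a disabled
            # flag is omitted instead of being written as "false".
            if not value:
                continue
            value = "true"
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


cpp_options = {
    "STAN_OPENCL": True,
    "OPENCL_PLATFORM_ID": 0,
    "OPENCL_DEVICE_ID": 0,
}
print(make_local_text(cpp_options), end="")
```

The resulting text would go into make/local under the CmdStan directory (cmdstanpy.cmdstan_path() reports where that is), followed by a clean rebuild of CmdStan.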

Thank you, guys.

When I try

sudo apt install nvidia-cuda-toolkit

I get this error message:

Preparing to unpack .../48-nvidia-cuda-dev_9.1.85-3ubuntu1_amd64.deb ...
dpkg: error processing archive /tmp/apt-dpkg-install-1gv7Ba/48-nvidia-cuda-dev_9.1.85-3ubuntu1_amd64.deb (--unpack):
 trying to overwrite '/usr/include/cublas.h', which is also in package libcublas-dev
error: paste subprocess was killed by signal (Broken pipe)

My environment:

/content# clinfo -l
Platform #0: NVIDIA CUDA
 `-- Device #0: Tesla P100-PCIE-16GB

Thu Oct 14 10:39:31 2021       
| NVIDIA-SMI 470.74       Driver Version: 460.32.03    CUDA Version: 11.2     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   32C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
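The `clinfo -l` listing above is where the values for OPENCL_PLATFORM_ID and OPENCL_DEVICE_ID come from. As a rough sketch, the listing can be parsed so the IDs are read from the machine instead of being typed by hand. The function name `parse_clinfo_list` is made up for illustration, and the parser only assumes the two-level "Platform #n / Device #n" layout shown in this post.

```python
import re


def parse_clinfo_list(text):
    """Return (platform_id, platform_name, device_id, device_name) tuples."""
    devices = []
    platform_id, platform_name = None, None
    for line in text.splitlines():
        match = re.match(r"\s*Platform #(\d+):\s*(.+)", line)
        if match:
            # Remember the current platform; device lines follow it.
            platform_id = int(match.group(1))
            platform_name = match.group(2).strip()
            continue
        match = re.search(r"Device #(\d+):\s*(.+)", line)
        if match and platform_id is not None:
            devices.append(
                (platform_id, platform_name, int(match.group(1)), match.group(2).strip())
            )
    return devices


# Listing copied from the post above.
sample = """Platform #0: NVIDIA CUDA
 `-- Device #0: Tesla P100-PCIE-16GB
"""

for p_id, p_name, d_id, d_name in parse_clinfo_list(sample):
    print(f"OPENCL_PLATFORM_ID={p_id} ({p_name}), OPENCL_DEVICE_ID={d_id} ({d_name})")
```

In a notebook, the text could come from running `clinfo -l` through subprocess.run with capture_output=True and text=True instead of the pasted string.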

I also tried cmdstanr in Google Colab and got the same problem as with cmdstanpy.

install.packages("cmdstanr", repos = c("", getOption("repos")))
library(cmdstanr)
install_cmdstan(cores = 4, overwrite = TRUE)
sm <- cmdstan_model("/stan_code.stan",
                    cpp_options = list(stan_opencl = TRUE))

Compilation finishes, but there is a problem with the sampling.

fit <- sm$sample(data = data, chains = 4, parallel_chains = 4,
                 opencl_ids = c(0, 0), refresh = 0)
Running MCMC with 4 parallel chains...

Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:

Chain 1 Exception: stan_code_model_namespace::log_prob: o_Tau is not symmetric. o_Tau[1,2] = inf, but o_Tau[2,1] = inf (in '/tmp/RtmpEBdlH8/model-4d182d8a1c.stan', line 43, column 8 to column 40)

Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,

Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.


↑ It freezes here.
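The exception above reports both mirrored entries of o_Tau as inf yet still calls the matrix asymmetric. One plausible reading, assuming the symmetry check compares the absolute difference of mirrored entries against a small tolerance: inf - inf is NaN, and every comparison with NaN is false. The tolerance value and the helper name `entries_match` below are illustrative and not taken from Stan's source.

```python
import math

TOLERANCE = 1e-8  # illustrative value, an assumption


def entries_match(a, b, tol=TOLERANCE):
    """True when two mirrored matrix entries agree within tol."""
    return abs(a - b) <= tol


print(entries_match(2.0, 2.0))            # finite, equal entries pass the check
print(entries_match(math.inf, math.inf))  # inf - inf is nan, so the check fails
print(math.isnan(math.inf - math.inf))
```

If that reading is right, the thing to investigate is why entries of o_Tau overflow to infinity (for example, extreme initial values), and the symmetry wording is a side effect.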

When I interrupt the frozen sampling, I see this error message:

Chain 3 -   done:   0%|          | 0/1 [01:58<?, ?it/s]Chain 3 processing error, non-zero return code 1
 error message:
	Can't open specified file, "/tmp/tmpht4im148/jwfqze60.json"

Chain 4 -   done:   0%|          | 0/1 [01:58<?, ?it/s]Chain 4 processing error, non-zero return code 1
 error message:
	Can't open specified file, "/tmp/tmpht4im148/jwfqze60.json"

Compile error:

INFO:cmdstanpy:compiling stan program, exe file: /content/drive/MyDrive/eiv_gpu/m4_non
INFO:cmdstanpy:compiler options: stanc_options=None, cpp_options={'STAN_THREADS': True, 'STAN_OPENCL': 'TRUE', 'OPENCL_PLATFORM_ID': 0, 'OPENCL_DEVICE_ID': 0}
ERROR:cmdstanpy:file /content/drive/MyDrive/eiv_gpu/m4_non.stan, exception ERROR
 In file included from stan/lib/stan_math/stan/math/opencl/prim.hpp:200:0,
                 from stan/lib/stan_math/stan/math/prim.hpp:7,
                 from stan/src/stan/io/dump.hpp:7,
                 from src/cmdstan/command.hpp:30,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp: In function ‘stan::return_type_t<T_y_cl, T_loc_cl, T_scale_cl> stan::math::normal_lcdf(const T_y_cl&, const T_loc_cl&, const T_scale_cl&)’:
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:215:39: error: the value of ‘stan::math::internal::opencl_normal_lcdf_impl’ is not usable in a constant expression
   auto lcdf_n = opencl_code<internal::opencl_normal_lcdf_impl>(
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:16:12: note: ‘stan::math::internal::opencl_normal_lcdf_impl’ was not declared ‘constexpr’
 const char opencl_normal_lcdf_impl[] = STRINGIFY(
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:221:31: error: the value of ‘stan::math::internal::opencl_normal_lcdf_ldncdf_impl’ is not usable in a constant expression
       = opencl_code<internal::opencl_normal_lcdf_ldncdf_impl>(
stan/lib/stan_math/stan/math/opencl/prim/normal_lcdf.hpp:59:12: note: ‘stan::math::internal::opencl_normal_lcdf_ldncdf_impl’ was not declared ‘constexpr’
 const char opencl_normal_lcdf_ldncdf_impl[] = STRINGIFY(
In file included from stan/lib/stan_math/stan/math/opencl/prim.hpp:249:0,
                 from stan/lib/stan_math/stan/math/prim.hpp:7,
                 from stan/src/stan/io/dump.hpp:7,
                 from src/cmdstan/command.hpp:30,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp: In function ‘stan::return_type_t<T> stan::math::std_normal_lcdf(const T_y_cl&)’:
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:201:29: error: the value of ‘stan::math::internal::opencl_std_normal_lcdf_impl’ is not usable in a constant expression
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:16:12: note: ‘stan::math::internal::opencl_std_normal_lcdf_impl’ was not declared ‘constexpr’
 const char opencl_std_normal_lcdf_impl[] = STRINGIFY(
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:205:39: error: the value of ‘stan::math::internal::opencl_std_normal_lcdf_dnlcdf’ is not usable in a constant expression
   auto dnlcdf = opencl_code<internal::opencl_std_normal_lcdf_dnlcdf>(
stan/lib/stan_math/stan/math/opencl/prim/std_normal_lcdf.hpp:52:12: note: ‘stan::math::internal::opencl_std_normal_lcdf_dnlcdf’ was not declared ‘constexpr’
 const char opencl_std_normal_lcdf_dnlcdf[] = STRINGIFY(
In file included from stan/lib/stan_math/stan/math/opencl/kernel_generator.hpp:136:0,
                 from stan/lib/stan_math/stan/math/opencl/rev/vari.hpp:7,
                 from stan/lib/stan_math/stan/math/rev/core/var.hpp:5,
                 from stan/lib/stan_math/stan/math/rev/core/profiling.hpp:6,
                 from src/cmdstan/write_profiling.hpp:4,
                 from src/cmdstan/command.hpp:21,
                 from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/kernel_generator/multi_result_kernel.hpp: In instantiation of ‘stan::math::results_cl<T_results>::operator=(const stan::math::expressions_cl<T_expressions ...>&)::<lambda(auto:20 ...)> [with auto:20 = {std::integral_constant<long unsigned int, 0ul>}; T_expressions = {const stan::math::constant_<double>&}; <template-parameter-2-2> = void; T_results = {stan::math::matrix_cl<double>&}]’:


make: *** [src/cmdstan/main_threads_opencl.o] Error 1 
ERROR:cmdstanpy:model compilation failed

Can you try to compile this GLM example model:

data {
  int<lower=1> k;
  int<lower=0> n;
  matrix[n, k] X;
  int y[n];
}
parameters {
  vector[k] beta;
  real alpha;
}
model {
  target += std_normal_lpdf(beta);
  target += std_normal_lpdf(alpha);
  target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
}
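To actually run the GLM example, it needs data matching its data block (k, n, X, y). Here is a minimal sketch that simulates such data with the standard library only. The sizes, the seed, and the function name `simulate_glm_data` are arbitrary choices for illustration, not values from this thread.

```python
import math
import random


def simulate_glm_data(n=100, k=3, seed=1):
    """Simulate inputs for the k, n, X, y fields of the GLM example."""
    rng = random.Random(seed)
    # Draw "true" coefficients from the same standard normal priors as the model.
    alpha = rng.gauss(0, 1)
    beta = [rng.gauss(0, 1) for _ in range(k)]
    X = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    y = []
    for row in X:
        eta = alpha + sum(b * x for b, x in zip(beta, row))
        p = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
        y.append(1 if rng.random() < p else 0)
    return {"k": k, "n": n, "X": X, "y": y}


data = simulate_glm_data()
print(data["n"], data["k"], len(data["X"]), len(data["y"]))
```

The returned dict can be passed as the data argument of sm.sample in cmdstanpy. A larger n is probably needed before any GPU benefit would show, which is an expectation and not something measured here.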