I’m trying to speed up a model on an HPC cluster by using the OpenCL support in CmdStan. To get everything up and running, I had to create my own Docker image (https://hub.docker.com/r/jofrhwld/stan-opencl). All of the diagnostics included here come from jobs submitted to the cluster that run that image.
My make/local file looks like this:
CXX = clang++
STAN_OPENCL=true
OPENCL_PLATFORM_ID=0
OPENCL_DEVICE_ID=0
I tried running the sample brms model from here like so:
library(cmdstanr)
#This is cmdstanr version 0.8.1
#- CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#- CmdStan path: /usr/share/.cmdstan
#- CmdStan version: 2.35.0
library(brms)
options(
  cmdstanr_verbose = TRUE
)
fit <- brm(count ~ zAge + zBase * Trt + (1|patient),
           data = epilepsy, family = poisson(),
           chains = 2, opencl = opencl(c(0, 0)),
           backend = "cmdstanr")
The model compiles successfully and begins sampling. The end of the output looks like this:
Chain 2 opencl
Chain 2 device = 0
Chain 2 platform = 0
Chain 2 opencl_platform_name = NVIDIA CUDA
Chain 2 opencl_device_name = Tesla P100-PCIE-12GB
Warning: Chain 2 finished unexpectedly!
Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Error: Fitting failed. Unable to retrieve the metadata.
In addition: Warning messages:
1: All chains finished unexpectedly! Use the $output(chain_id) method for more information.
2: No chains finished successfully. Unable to retrieve the fit.
Execution halted
Based on a previous question, I also tried:
fit <- cmdstanr_example(chains = 1)
This also compiled successfully, but then resulted in:
Chain 1 Unrecoverable error evaluating the log probability at the initial value.
Chain 1 Exception: compile_kernel: calculate : Unknown error -11 (in '/tmp/Rtmp2fQHM3/model-2c6a73b284efe.stan', line 13, column 2 to column 37)
Warning: Chain 1 finished unexpectedly!
Warning message:
No chains finished successfully. Unable to retrieve the fit.
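The warnings above point to the $output() method for more detail. A minimal sketch of pulling the full console output for the failed chain, assuming cmdstanr_example() still returns the fit object after the chain fails (I have not confirmed that it does in this setup):

```r
library(cmdstanr)

# Same call as above; chains = 1 is passed through to the sampling method.
fit <- cmdstanr_example(chains = 1)

# Show everything chain 1 wrote to the console, as suggested by the
# cmdstanr warning. Assumption: `fit` exists even though the chain failed.
fit$output(1)
```

If the object is returned, this should show the complete error text around the compile_kernel exception rather than only the last few lines.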
Any suggestions for resolving this would be much appreciated!
EDIT:
Running clinfo -l returns:
Platform #0: NVIDIA CUDA
`-- Device #0: Tesla P100-PCIE-12GB