I previously reported in this thread that I failed to fit a model using CmdStanR on the GPU with OpenCL on WSL2, encountering the error Chain <CHAIN_NUMBER> OpenCL Initialization: [Device] CL_INVALID_DEVICE: Unknown error -1. After reading Help setting up for GPU computation (OSX), I realised that I might have failed to execute runTests.py as described in Stan Math Library: OpenCL CPU/GPU Support and Quick Start · stan-dev/math Wiki. I would like to confirm whether this is indeed the case.
Question
Following the articles above, I executed runTests.py as follows:
cd ~/.cmdstan/cmdstan-2.35.0/stan/lib/stan_math/
python3 runTests.py test/unit -f opencl
A large number of tests ran, but for all tests within the visible command line history, I saw Running 0 tests from 0 test suites.
For example:
------------------------------------------------------------
test/unit/math/opencl/cholesky_decompose_test --gtest_output="xml:test/unit/math/opencl/cholesky_decompose_test.xml"
Running main() from lib/benchmark_1.5.1/googletest/googletest/src/gtest_main.cc
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[ PASSED ] 0 tests
Does this indicate that the tests are failing? If the tests were successful, what should the output look like?
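Because only part of the output stays visible in the terminal history, it may help to save the full output of runTests.py to a file and scan it for test binaries that report zero tests. The following is a minimal sketch, not part of Stan Math: the function names are made up for illustration, and it assumes the log has the same shape as the excerpt above (the echoed command line containing --gtest_output=, followed by the googletest "Running N tests" banner).

```python
import re

# Matches the googletest banner, e.g.
# "[==========] Running 0 tests from 0 test suites."
BANNER = re.compile(r"Running (\d+) tests? from (\d+) test (?:suites?|cases?)")


def summarise_gtest_log(text):
    """Return (binary, n_tests) pairs for every test binary found in the log.

    Assumption: each binary's command line is echoed before it runs and
    contains "--gtest_output=", as in the excerpt above.
    """
    results = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if "--gtest_output=" in stripped:
            # First token of the echoed command line is the binary path.
            current = stripped.split()[0]
            continue
        match = BANNER.search(stripped)
        if match and current is not None:
            results.append((current, int(match.group(1))))
            current = None
    return results


def empty_binaries(text):
    """Return the binaries whose banner reports zero tests."""
    return [name for name, n_tests in summarise_gtest_log(text) if n_tests == 0]


# Sample copied from the output shown above.
sample = """\
test/unit/math/opencl/cholesky_decompose_test --gtest_output="xml:test/unit/math/opencl/cholesky_decompose_test.xml"
Running main() from lib/benchmark_1.5.1/googletest/googletest/src/gtest_main.cc
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[ PASSED ] 0 tests
"""

if __name__ == "__main__":
    for name in empty_binaries(sample):
        print("no tests ran in:", name)
```

To use it on a real run, the output could first be captured with something like python3 runTests.py test/unit -f opencl 2>&1 | tee opencl_tests.log, and the contents of that file passed to empty_binaries. The script only reports which binaries ran zero tests; it does not say why.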
My environment
OS, programming languages, and hardware
- Operating System: Ubuntu 24.04 LTS on Windows Subsystem for Linux 2 (WSL2)
  - All programs below were installed and run within the WSL2 environment, NOT the native Windows environment.
- CmdStan Version: CmdStan v2.35.0
- R 4.4.1
  - cmdstanr: 0.8.1
- Python 3.12.3
- Compiler/Toolkit:
  - CUDA 12.1
    $ nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2023 NVIDIA Corporation
    Built on Mon_Apr__3_17:16:06_PDT_2023
    Cuda compilation tools, release 12.1, V12.1.105
    Build cuda_12.1.r12.1/compiler.32688072_0
- GPU: NVIDIA RTX 3060 (12GB of dedicated memory)
  $ clinfo -l
  Platform #0: Portable Computing Language
   `-- Device #0: NVIDIA GeForce RTX 3060
- CPU: Intel Core i9-10980XE (18 cores, 36 threads)
Installation of PoCL
Initially, clinfo -l did not display any platforms or devices, due to the problem described in the GitHub issue No OpenCL platforms reported · Issue #6951 · microsoft/WSL. Following the solution provided in a comment on that issue, I successfully installed PoCL, and now clinfo and clinfo -l recognise my GPU.
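For reference, the platform and device names can also be pulled out of the clinfo -l listing programmatically, which makes it easy to see which platform actually exposes the GPU. This is only a sketch: parse_clinfo_list is a hypothetical helper name, and it assumes the listing has the two-level "Platform #N: ..." / "Device #N: ..." layout that clinfo -l prints in my environment.

```python
import re

# Line patterns assumed from the `clinfo -l` listing format.
PLATFORM = re.compile(r"^Platform #(\d+):\s*(.+)$")
DEVICE = re.compile(r"Device #(\d+):\s*(.+)$")


def parse_clinfo_list(text):
    """Map each platform name to the list of device names listed under it."""
    platforms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        platform_match = PLATFORM.match(line)
        if platform_match:
            current = platform_match.group(2).strip()
            platforms[current] = []
            continue
        # Device lines start with tree-drawing characters, so search, not match.
        device_match = DEVICE.search(line)
        if device_match and current is not None:
            platforms[current].append(device_match.group(2).strip())
    return platforms


# Sample copied from the output of `clinfo -l` in this post.
sample = (
    "Platform #0: Portable Computing Language\n"
    " `-- Device #0: NVIDIA GeForce RTX 3060\n"
)

if __name__ == "__main__":
    for platform, devices in parse_clinfo_list(sample).items():
        print(platform, "->", devices)
```

On a live system the text could come from subprocess.run(["clinfo", "-l"], capture_output=True, text=True).stdout instead of the hard-coded sample.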
My final PoCL installation went as follows. (In fact, I repeated the installation several times with different settings, making sure to run xargs rm < install_manifest.txt in the pocl-6.0/build directory and to delete the pocl-6.0/build directory before each reinstallation.)
- Executed the following commands to install PoCL as per the official PoCL installation guide:
export LLVM_VERSION=18
apt install -y python3-dev libpython3-dev build-essential ocl-icd-libopencl1 \
    cmake git pkg-config libclang-${LLVM_VERSION}-dev clang-${LLVM_VERSION} \
    llvm-${LLVM_VERSION} make ninja-build ocl-icd-libopencl1 ocl-icd-dev \
    ocl-icd-opencl-dev libhwloc-dev zlib1g zlib1g-dev clinfo dialog apt-utils \
    libxml2-dev libclang-cpp${LLVM_VERSION}-dev libclang-cpp${LLVM_VERSION} \
    llvm-${LLVM_VERSION}-dev
- Downloaded PoCL:
wget https://github.com/pocl/pocl/archive/refs/tags/v6.0.tar.gz
- Extracted the tarball:
tar -xzvf v6.0.tar.gz
- Changed to the PoCL directory:
cd pocl-6.0
- Created a build directory:
mkdir build
- Built PoCL following the instructions from the GitHub issue comment:
# -DENABLE_HOST_CPU_DEVICES=OFF: having both CPU and GPU simultaneously can cause issues
#   https://github.com/pocl/pocl/issues/853#issuecomment-696367623
# -DWITH_LLVM_CONFIG: https://forums.developer.nvidia.com/t/need-support-to-run-opencl-application-on-tx2-board/264420/4
# -DENABLE_EXAMPLES=ON: to install the CUDA tests, see NVIDIA GPU support — Portable Computing Language (PoCL) 6.0 documentation
#   https://portablecl.org/docs/html/cuda.html#run-tests
cmake -B build \
    -DCMAKE_C_FLAGS=-L/usr/lib/wsl/lib \
    -DCMAKE_CXX_FLAGS=-L/usr/lib/wsl/lib \
    -DENABLE_HOST_CPU_DEVICES=OFF \
    -DENABLE_CUDA=ON \
    -DWITH_LLVM_CONFIG=/usr/bin/llvm-config-${LLVM_VERSION} \
    -DENABLE_EXAMPLES=ON
- Compiled PoCL:
cmake --build build -j34
- Added environment variables to .bashrc and edited the NVIDIA ICD file:
echo 'export POCL_BUILDING=1' >> ~/.bashrc
echo 'export OCL_ICD_VENDORS=<FULL_PATH_OF_MY_HOME_DIR>/pocl-6.0/build/ocl-vendors/' >> ~/.bashrc
sudo nano /etc/OpenCL/vendors/nvidia.icd
# Default is `/libnvidia-opencl.so` but there is no such file at that path.
# Therefore, replace the default line with the following line:
# export /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.550.90.07
source ~/.bashrc
- Installed PoCL:
cmake --install build
- Verified GPU recognition with clinfo --list:
$ clinfo --list
Platform #0: Portable Computing Language
 `-- Device #0: NVIDIA GeForce RTX 3060
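Since the steps above involve both OCL_ICD_VENDORS and a hand-edited .icd file, one sanity check is to list what each .icd file in a vendor directory points to and whether that library exists. The sketch below is an illustration under stated assumptions, not an official tool: check_icd_files is a hypothetical name, and it assumes that the first non-empty line of an .icd file is either an absolute library path or a bare library name that the dynamic loader resolves.

```python
import os
import tempfile
from pathlib import Path


def check_icd_files(vendor_dir):
    """Return (icd_file_name, entry, status) for each .icd file in vendor_dir.

    status is one of "found", "missing", "bare-name" or "empty".
    Assumption: the first non-empty line of the file is the library entry.
    """
    report = []
    for icd in sorted(Path(vendor_dir).glob("*.icd")):
        lines = [ln.strip() for ln in icd.read_text().splitlines() if ln.strip()]
        entry = lines[0] if lines else ""
        if not entry:
            status = "empty"
        elif os.path.isabs(entry):
            # Absolute path: it can be checked directly on disk.
            status = "found" if os.path.exists(entry) else "missing"
        else:
            # Bare library name: left to the dynamic loader, not checked here.
            status = "bare-name"
        report.append((icd.name, entry, status))
    return report


if __name__ == "__main__":
    # Self-contained demo in a temporary directory; on a real system the
    # argument would be e.g. /etc/OpenCL/vendors or the OCL_ICD_VENDORS path.
    with tempfile.TemporaryDirectory() as tmp:
        fake_lib = Path(tmp) / "libfake-opencl.so"
        fake_lib.write_text("")
        (Path(tmp) / "ok.icd").write_text(f"{fake_lib}\n")
        (Path(tmp) / "broken.icd").write_text("/nonexistent/libnope.so\n")
        for name, entry, status in check_icd_files(tmp):
            print(f"{name}: {entry} [{status}]")
```

A "missing" status for an absolute path would match the situation described above, where the default entry pointed to a file that did not exist.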
@rok_cesnovar, you responded in the aforementioned thread, Help setting up for GPU computation (OSX) - Modeling - The Stan Forums, so I am tagging you here in case you have any insights (apologies if this is inconvenient). Comments from others are also very welcome. Any ideas or suggestions would be greatly appreciated!