Tutorial on running STAN code on WSL

Tong_Zhao · February 23, 2025, 4:50pm

I want to share how I got GPU acceleration working for Stan on WSL (Windows Subsystem for Linux). After a long struggle with the OpenCL setup, I now have a working configuration and am documenting the procedure here for other developers and data scientists.

The solution involves three key phases:

1. PoCL implementation
2. R environment configuration
3. Validation and execution

Prerequisites

This guide assumes you have:

Functional WSL installation with Linux distribution
Operational RStudio Server
Installed cmdstanr package

1. PoCL Implementation

Given NVIDIA’s lack of native OpenCL support for WSL, we employ the Portable Computing Language (PoCL) framework.

Critical Preliminary Steps:

Remove any existing OpenCL installations in WSL (clinfo -l should return empty)
Follow this comprehensive guide for PoCL configuration
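
A quick way to check the first point is to look at both the ICD registry and clinfo. This is a minimal sketch; it assumes ICD files live in the conventional /etc/OpenCL/vendors directory, which may differ on your distribution, and list_icds is a helper name introduced here for illustration.

```shell
#!/usr/bin/env bash
# List OpenCL ICD files registered in a vendors directory (one per line).
# /etc/OpenCL/vendors is the conventional location; adjust if yours differs.
list_icds() {
  local dir="${1:-/etc/OpenCL/vendors}"
  [ -d "$dir" ] || return 0          # no directory means no registered ICDs
  find "$dir" -maxdepth 1 -name '*.icd' -print
}

icds="$(list_icds)"
if [ -z "$icds" ]; then
  echo "No system-wide OpenCL ICDs found."
else
  echo "Existing ICDs (consider removing the packages that own them):"
  echo "$icds"
fi

# clinfo may not be installed yet; only call it when available.
if command -v clinfo >/dev/null 2>&1; then
  clinfo -l || true
fi
```

An empty result from both checks suggests the system is clean enough to proceed with the PoCL build.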

Implementation Workflow:

Host Machine Preparation

Install latest NVIDIA drivers on Windows host
Do not install GPU drivers within WSL
Reference: CUDA on WSL User Guide

CUDA Toolkit Installation

Install CUDA 12.4 (recommended for PyTorch compatibility) via official repository
Verify that the GPU is visible from WSL:

nvidia-smi
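
nvidia-smi reports on the driver side; it does not by itself show that the CUDA toolkit is installed. A hedged sketch of a fuller check follows. It assumes the driver libraries are exposed under /usr/lib/wsl/lib (the path the PoCL build flags in this guide point to) and that the toolkit's nvcc is on PATH. The helper names are made up for this example.

```shell
#!/usr/bin/env bash
# Print a one-line status for a component needed by the CUDA build.
check() {
  local label="$1" ok="$2"
  if [ "$ok" = "yes" ]; then echo "OK       $label"; else echo "MISSING  $label"; fi
}

# Echo "yes" or "no" depending on whether a path or command exists.
has_file() { [ -e "$1" ] && echo yes || echo no; }
has_cmd()  { command -v "$1" >/dev/null 2>&1 && echo yes || echo no; }

check "libcuda in /usr/lib/wsl/lib (from the Windows driver)" \
      "$(has_file /usr/lib/wsl/lib/libcuda.so.1)"
check "nvidia-smi on PATH" "$(has_cmd nvidia-smi)"
check "nvcc on PATH (CUDA toolkit)" "$(has_cmd nvcc)"
```

If nvcc is reported missing even though the toolkit is installed, its bin directory may simply not be on PATH.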

PoCL Compilation

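# Download and unpack the PoCL 6.0 sources (run from your home directory,
# since the paths below assume ~/pocl-6.0)
wget https://github.com/pocl/pocl/archive/refs/tags/v6.0.zip
unzip v6.0.zip && cd pocl-6.0
# Configure a CUDA-only build; the -L flags let the linker find the
# driver libraries that WSL exposes under /usr/lib/wsl/lib
cmake -B build \
  -DCMAKE_C_FLAGS=-L/usr/lib/wsl/lib \
  -DCMAKE_CXX_FLAGS=-L/usr/lib/wsl/lib \
  -DENABLE_HOST_CPU_DEVICES=OFF \
  -DENABLE_CUDA=ON
cmake --build build -j$(nproc)
# Point the OpenCL ICD loader at the build tree
echo 'export POCL_BUILDING=1' >> ~/.bashrc
echo 'export OCL_ICD_VENDORS=~/pocl-6.0/build/ocl-vendors/' >> ~/.bashrc
source ~/.bashrc
# Installing to the default prefix usually requires root
sudo cmake --install build
sudo apt install clinfo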

Verification:

clinfo --list
# Expected output: a PoCL platform listing your GPU, e.g. NVIDIA GeForce RTX 4090
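
If clinfo lists nothing, one thing worth checking is whether OCL_ICD_VENDORS points at a directory that actually contains an .icd file. A minimal sketch to inspect this, assuming the directory layout used above; inspect_vendors is a hypothetical helper, and each .icd file is expected to hold the name or path of the OpenCL library it registers.

```shell
#!/usr/bin/env bash
# Print each .icd file in a vendors directory together with the library
# entry it contains, and note entries that are not existing paths.
inspect_vendors() {
  local dir="$1" icd lib found=0
  for icd in "$dir"/*.icd; do
    [ -e "$icd" ] || continue        # glob did not match anything
    found=1
    lib="$(head -n 1 "$icd")"
    if [ -e "$lib" ]; then
      echo "$icd -> $lib"
    else
      echo "$icd -> $lib (not an existing path; may be resolved by the loader)"
    fi
  done
  [ "$found" -eq 1 ] || echo "no .icd files in $dir"
}

inspect_vendors "${OCL_ICD_VENDORS:-$HOME/pocl-6.0/build/ocl-vendors}"
```

An empty directory here usually means the build step did not finish or the variable points at the wrong location.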

2. R Environment Configuration

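RStudio Server sessions are not started from your login shell, so they typically do not inherit the variables exported in ~/.bashrc. Set them for R directly: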

Edit system-wide R configuration:

sudo vim /usr/lib/R/etc/Renviron.site

Append:

# Renviron files are not processed by a shell, so ~ is not expanded; use ${HOME}
POCL_BUILDING=1
OCL_ICD_VENDORS=${HOME}/pocl-6.0/build/ocl-vendors/

Restart R session (Session -> Restart R)
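
The two lines can also be generated by a script. The sketch below is a convenience under stated assumptions, not part of the original procedure: append_once is a hypothetical helper that adds a line only if it is missing, and the vendors path is written in absolute form because Renviron files are not processed by a shell, so a leading ~ may not be expanded. By default it writes to a scratch file so it is safe to try.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
append_once() {
  local line="$1" file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Default to a scratch file; review the result before copying it into
# /usr/lib/R/etc/Renviron.site.
RENVIRON="${RENVIRON:-$(mktemp)}"

append_once "POCL_BUILDING=1" "$RENVIRON"
append_once "OCL_ICD_VENDORS=$HOME/pocl-6.0/build/ocl-vendors/" "$RENVIRON"

echo "Wrote to $RENVIRON:"
cat "$RENVIRON"
```

Running it twice leaves the file unchanged, which avoids duplicate entries if you repeat the setup.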

3. Validation & Execution

OpenCL Verification (requires the OpenCL R package):

OpenCL::oclPlatforms()

Stan Model (saved as opencl-files/bernoulli_logit_glm.stan) and R Code:

data {
  int<lower=1> k;
  int<lower=0> n;
  matrix[n, k] X;
  array[n] int y;
}
parameters {
  vector[k] beta;
  real alpha;
}
model {
  target += std_normal_lpdf(beta);
  target += std_normal_lpdf(alpha);
  target += bernoulli_logit_glm_lpmf(y | X, alpha, beta);
}

library(cmdstanr)

# Synthetic dataset
n <- 250000
k <- 20
X <- matrix(rnorm(n * k), ncol = k)
y <- rbinom(n, size = 1, prob = plogis(3 * X[,1] - 2 * X[,2] + 1))
mdata <- list(k = k, n = n, y = y, X = X)

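# GPU-accelerated compilation
mod_cl <- cmdstan_model("opencl-files/bernoulli_logit_glm.stan",
                        cpp_options = list(stan_opencl = TRUE))

# Sample on the GPU; opencl_ids = c(platform, device). c(0, 0) assumes the
# PoCL platform and the GPU are the first entries reported by clinfo.
fit_cl <- mod_cl$sample(data = mdata, chains = 4, parallel_chains = 4,
                        opencl_ids = c(0, 0), refresh = 0)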

Performance Monitoring:

Use nvitop to monitor GPU utilization in real time. If the setup is working, GPU utilization rises noticeably while the model is sampling.
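
If nvitop is not installed, nvidia-smi itself can poll utilization. A minimal sketch, assuming nvidia-smi is on PATH inside WSL; the one-second interval and the chosen fields are arbitrary, and gpu_watch_cmd is a name invented for this example. The script only prints the command, since the polling loop runs until interrupted.

```shell
#!/usr/bin/env bash
# Build the nvidia-smi polling command: timestamp, GPU utilization, and
# memory use as CSV, refreshed every INTERVAL seconds (default 1).
gpu_watch_cmd() {
  local interval="${1:-1}"
  echo "nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used --format=csv -l $interval"
}

echo "Run this in a second terminal while the model is sampling:"
gpu_watch_cmd 1
```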
