GPU setup on AWS EC2 with Docker

nerutenbeck · January 23, 2021, 6:05am

Hi there,

I’m trying to set up an AWS G4 instance to get going using a GPU. At this point I’m just trying to get the Hello World bernoulli in the docs to run using CmdStanPy in our Docker container.

I first did:

apt-get update && apt-get install ocl-icd-opencl-dev nvidia-cuda-toolkit clinfo

then tried to fit the model:


import os
from cmdstanpy import cmdstan_path, CmdStanModel

bernoulli_stan = os.path.join(cmdstan_path(), 'examples', 'bernoulli', 'bernoulli.stan')

bernoulli_model = CmdStanModel(
    stan_file=bernoulli_stan,
    cpp_options={"STAN_OPENCL": True},
    )

bernoulli_data = os.path.join(
    cmdstan_path(), 'examples', 'bernoulli', 'bernoulli.data.json'
    )

bern_fit = bernoulli_model.sample(
    data=bernoulli_data,
    output_dir='/opt'
    )

but hit the following in stdout:

opencl_context: clGetPlatformIDs CL_PLATFORM_NOT_FOUND_KHR: Unknown error -1001

After a little research on the GPU threads here I started following instructions here.

I’m still stumped, though. Not sure if I missed some crucial documentation somewhere or if there are other tricks I need to be aware of, but the result of calling clinfo is:

Number of platforms                               0

The result of cat /etc/OpenCL/vendors/* is:

libnvidia-opencl.so.1

so I'm not exactly sure how to proceed with step 3 in the doc I linked to. I'd appreciate any assistance or links to docs, or a redirect if I’m entirely misguided here.
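For anyone debugging the same state (a vendor file is present but clinfo reports zero platforms), here is a minimal Python sketch of one check. It reads each .icd file in /etc/OpenCL/vendors and tries to load the shared library it names. The helper name check_icd_libraries is made up for illustration, and the sketch assumes the usual Linux layout where each .icd file contains one library name or path per line. If a library listed there cannot be loaded, that would be consistent with the driver library not being visible inside the container, though it is not the only possible cause.

```python
import ctypes
from pathlib import Path


def check_icd_libraries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each library named in an .icd file to whether it can be loaded.

    Hypothetical helper: assumes each .icd file lists one shared library
    name (or absolute path) per line, as in the cat output above.
    """
    results = {}
    for icd_file in sorted(Path(vendors_dir).glob("*.icd")):
        for line in icd_file.read_text().splitlines():
            library = line.strip()
            if not library:
                continue
            try:
                # Try to load the library the same way a dynamic loader would.
                ctypes.CDLL(library)
                results[library] = True
            except OSError:
                # Missing file, wrong architecture, or unresolved dependencies.
                results[library] = False
    return results
```

Calling check_icd_libraries() inside the container returns a dict such as {"libnvidia-opencl.so.1": False} when the named library is not loadable there; an empty dict means no .icd files were found at all.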

Thanks for your help!

rok_cesnovar · January 23, 2021, 7:01am

I think the only thing missing is a reboot of the instance.

The driver is installed with the cuda-toolkit so that should be good.

rok_cesnovar · January 23, 2021, 11:44am

I apologize, it seems that installing the driver separately is required. So the minimal instructions are

sudo apt-get -y update
sudo apt-get install -y nvidia-driver-460 nvidia-cuda-toolkit clinfo

Someone seeing this at a later point should replace 460 with a newer driver version number if one is available.

nerutenbeck · January 26, 2021, 12:20pm

@rok_cesnovar thanks for the suggestions! Still having trouble here, though. I think it is perhaps made a little more complicated by the use of a docker image. My hunch is I probably need to install the drivers and toolchain outside Docker, which took me to the AWS NVIDIA driver install guide. I’m not sure how that matches up with what CmdStan expects. In any case, when I figure out the full toolchain install I’ll post an update for posterity.

spinkney · January 27, 2021, 2:49pm

hi @nerutenbeck , I’m going to add a stan blog post to set up rstudio and stan on ec2. Do you have any interest in doing one to set up gpu?

nerutenbeck · January 27, 2021, 3:50pm

@spinkney I’d love to! Thanks for the offer. DM me to briefly discuss timing and details? I’m also on the Stan slack channel if you’d rather discuss there.

Amos · July 23, 2021, 9:19pm

Did the stan on ec2-gpu blog post ever go up? If so could you provide a link?

chris1 · June 2, 2023, 1:51am

Hi, it’s nearly 2 years later, but I had similar errors trying to set up GPUs on an HPC platform, using Apptainer instead of Docker.

The first problem I had was with the output of
$ clinfo
Number of platforms 0

This is because the container needs to be told to use GPUs: for Apptainer, pass the --nv flag when executing/running, and for Docker, pass the --gpus all flag. Then the container can find the platform and device when running clinfo -l.
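A small Python sketch of how this check could be automated before compiling a model with STAN_OPENCL. It runs clinfo and parses the "Number of platforms" line shown above. The function names are made up for illustration, and the sketch assumes clinfo is installed and on PATH inside the container.

```python
import re
import subprocess


def parse_platform_count(clinfo_output):
    """Return the platform count reported by clinfo, or None if the line is absent."""
    match = re.search(r"^Number of platforms\s+(\d+)", clinfo_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def opencl_platform_count():
    """Run clinfo (assumed to be on PATH) and return its platform count.

    Returns None if clinfo is not installed or prints no platform line.
    """
    try:
        result = subprocess.run(
            ["clinfo"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    return parse_platform_count(result.stdout)
```

A result of 0 from opencl_platform_count() matches the situation described above, where the container was started without GPU access or no OpenCL platform is installed in it.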

The second was the following error:
opencl_context: clGetPlatformIDs CL_PLATFORM_NOT_FOUND_KHR: Unknown error -1001

This was solved by installing some additional dependencies when building the container
apt-get -y install pocl-opencl-icd nvidia-settings
which I found from reading this reference.

I hope that helps anyone who is similarly stuck.
Chris

Jacob_stanlearner · June 26, 2023, 3:49pm

Thank you for your help! The solution “apt-get -y install pocl-opencl-icd nvidia-settings” worked for me. I want to use brms in Google Colab (runtime type: R) to run some models with GPU acceleration. For example, the following code:

m1 <- brm(y~ x+(1|id), family=negbinomial, data = df, warmup = 1000, iter = 2000, cores=4,chains = 4, opencl = opencl(c(0, 0)), backend="cmdstanr")

However, it returned many errors, mainly related to “opencl”. Then, I tried running this line in the Colab terminal: “apt-get -y install pocl-opencl-icd nvidia-settings”. Amazingly, it worked!

Hope this can help others who want to run brms with a GPU on Google Colab!
