I am running a hidden Markov model on a Linux server. I hoped to gain speed by running on a GPU instead of a CPU, but it takes about five times as long.
For the GPU, nvidia-smi reports version 440.33.01. One thing I noticed is that the NVIDIA driver automatically lowers the OpenCL version I am loading.
If I run "module load gnu/7.4.0 opencl/2.2" instead of "module load gnu/7.4.0 cuda/10.2", the estimation still generates no load on the GPU and runs very slowly.
According to the tutorial on running CmdStan on a GPU, I don’t have to make any adjustments to my model to run it on a GPU. So I am wondering whether the structure of my model prevents it from being more efficient on a GPU. Is there anything else I can try?
The list of functions that currently get a speedup is given here:
For now, the only functions that get a speedup are the lpdf/lpmf distribution functions, the Cholesky decomposition, and matrix multiplication. For the rest you will have to wait for Stan 2.27 or tinker with C++ (the backend support is already there, it is just not used at the moment).
It's fairly fresh, so I know it does not yet get picked up by Google when you search for OpenCL or GPU and Stan. So I am just taking this opportunity to share it again.
Thanks. My model has lpdf/lpmf distribution functions and matrix multiplications, so I expected at least to have a bit of speed improvement. What makes me wonder is that the model runs way slower on GPU.
Is there anything I can try to see if my installation is correct?
I noticed that you added a message that provides a link to the “best” guide. I have checked it and this is what I did…
Yes, the instructions you pointed to are good; it's just that they are for Stan Math, and that can confuse some people. Clearly it didn't confuse you, which is good :) I just linked them again so they get more exposure.
Well, this really depends on the input sizes and which lpdf you are using. For example, the Poisson distribution can be much faster, while the Bernoulli one is only slightly faster. If you can't share the model, I would advise profiling your model to find the bottlenecks and see where the slowdown comes from.
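Once profiling output is available, a small script can make the bottlenecks easier to spot. The following is only a minimal sketch: it assumes the profile has been exported as a CSV with a `name` column and a `total_time` column (in seconds), and that any comment lines start with `#`. The function name `rank_profile_sections` and the example rows are hypothetical, not taken from this thread.

```python
import csv
import io


def rank_profile_sections(csv_text, time_column="total_time"):
    """Return (name, seconds, share) tuples, slowest section first.

    Assumes a header row with a 'name' column and a numeric time column.
    """
    # Drop blank lines and comment lines (assumed to start with '#').
    lines = [
        ln for ln in csv_text.splitlines()
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    totals = {}
    for row in csv.DictReader(io.StringIO("\n".join(lines))):
        # Sum rows that share a name, e.g. one row per thread.
        totals[row["name"]] = totals.get(row["name"], 0.0) + float(row[time_column])
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, secs, secs / grand_total if grand_total else 0.0)
        for name, secs in ranked
    ]


# Hypothetical numbers, only to show the call; not measurements.
example = """name,thread_id,total_time
forward_algorithm,1,4.0
priors,1,1.0
"""
for name, secs, share in rank_profile_sections(example):
    print(f"{name}: {secs:.2f}s ({share:.0%})")
```

Running the same script on a CPU profile and a GPU profile of the same model would show which section accounts for the slowdown. Note that the shares are relative to the sum of the listed sections, so nested sections would be counted twice.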