I have a model with 160 covariance matrices, each 2000x2000 (a hierarchical Gaussian process regression), that will take weeks to run on CPU. I thought I'd try building the pull-pending branch of stan-math that has a GPU-based Cholesky decomposition. I use the script below to set up a build on a Google Compute Engine instance with a Tesla K80 GPU running Ubuntu 16.04, but the final line, which attempts to run the GPU Cholesky decomposition test, fails. Has anyone else tried this with success?
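# become root for the package installation steps
sudo su -
# download and register NVIDIA's CUDA 8.0 repository package for Ubuntu 16.04
curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
# install the CUDA toolkit, CUDA headers, and OpenCL development files
apt-get install nvidia-cuda-dev nvidia-cuda-toolkit opencl-dev
# fetch the source and set the compiler
git clone https://github.com/stan-dev/stan.git
nano makefile # edit so that CC=g++
# run the GPU Cholesky decomposition test (this is the step that fails)
./runTests.py 'test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.cpp' --debug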
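Before digging into the test failure itself, it may help to confirm that the tools the build depends on are actually on the PATH. The following is only a minimal sketch and not part of the script above: `check_tool` is a hypothetical helper name, and the list of tools is an assumption about what matters here (`nvidia-smi` normally ships with the NVIDIA driver, `nvcc` with the CUDA toolkit, and `clinfo` is a separate package that the script above does not install).

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on
# the PATH and returns non-zero when it is not.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed list of tools worth checking; adjust for your setup.
for tool in g++ nvcc nvidia-smi clinfo; do
  check_tool "$tool" || true   # keep going so every tool gets reported
done
```

If `nvidia-smi` or `nvcc` shows up as missing, the test failure may come from the driver or toolkit installation rather than from the stan-math branch.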