Building the GPU Cholesky branch

I have a model with 160 covariance matrices, each 2000x2000 (a hierarchical Gaussian process regression), that will take weeks on CPU. I thought I’d try building the branch of stan-math, still pending as a pull request, that adds GPU-based Cholesky decomposition. I use the script below to set up a build on a Google Compute Engine instance with a Tesla K80 GPU running Ubuntu 16.04, but the final line, which attempts to run the GPU Cholesky decomposition test, fails. Has anyone else tried this with success?

sudo su -
curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
apt-get update
apt-get install nvidia-cuda-dev nvidia-cuda-toolkit opencl-dev
git clone https://github.com/stan-dev/stan.git
cd stan
make math-update/gpu_cholesky
cd lib/stan_math/
nano makefile  # edit so that CC=g++


./runTests.py  'test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.cpp' --debug
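
Before digging into the test itself, it may be worth confirming that the instance actually exposes the GPU to the driver and to OpenCL. The following is only a minimal sketch of such a check, not part of the build script above. It assumes `nvidia-smi` came with the NVIDIA driver and that `clinfo` has been installed separately (for example with `apt-get install clinfo`); `check_tool` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# List the GPUs the NVIDIA driver can see (assumes the driver is installed).
if check_tool nvidia-smi; then
    nvidia-smi -L || echo "nvidia-smi could not query the driver"
fi

# List the OpenCL platforms (assumes clinfo was installed separately).
if check_tool clinfo; then
    clinfo | grep -i 'platform name' || echo "no OpenCL platform reported"
fi
```

If either tool is reported missing, or no GPU or OpenCL platform shows up, the test failure is more likely an environment problem than a problem in the branch.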

No, but you might want to jump into the GitHub discussion on stan-dev/math.

Gotcha, will do.