Hi,

I am trying to compile Stan with GPU (OpenCL) support. I am not sure whether the problem is on my side, so I want to ask here before opening an issue. I am testing this on two different setups and getting a different error on each.

## Ubuntu 20

The first machine runs Ubuntu 20. I installed the NVIDIA drivers (latest version, 460) and `nvidia-cuda-toolkit` through `apt-get`, so the system is using CUDA 10.1. CmdStan was installed through `cmdstanpy`. Once it was installed, I went directly into the directory where CmdStan lives (i.e. no Python call) and ran `./runTests.py test/unit -f opencl`, following all the instructions outlined in "Stan Math Library: OpenCL CPU/GPU Support". Compilation took roughly two hours, and once it finished all the tests appeared to pass.

However, things go wrong when I compile a simple Stan program through the `cmdstanpy` interface. More precisely, I run:

```
from cmdstanpy import CmdStanModel

# Compile the program with OpenCL enabled
cpp_options = {
    # 'STAN_THREADS': True,
    # 'STAN_CPP_OPTIMS': True,
    'STAN_OPENCL': True,
    'OPENCL_PLATFORM_ID': 0,
    'OPENCL_DEVICE_ID': 0,
}
sm = CmdStanModel(stan_file='./stan_files/BNN.stan', cpp_options=cpp_options)
```

This gives the following error:

```
ERROR:cmdstanpy:file /home/jmaronasm/stan/stan_files/BNN.stan, exception ERROR
In file included from stan/lib/stan_math/stan/math/opencl/prim.hpp:86,
from stan/lib/stan_math/stan/math/prim.hpp:7,
from stan/src/stan/io/dump.hpp:7,
from src/cmdstan/command.hpp:24,
from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/scalar_type.hpp:14:8: error: partial specialization of ‘struct stan::scalar_type<T, typename std::enable_if<stan::math::conjunction<stan::is_kernel_expression_and_not_scalar<T, void> >::value, void>::type>’ after instantiation of ‘struct stan::scalar_type<stan::math::constant_<int>, void>’ [-fpermissive]
14 | struct scalar_type<T, require_all_kernel_expressions_and_none_scalar_t<T>> {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make: *** [make/program:14: src/cmdstan/main_opencl.o] Error 1
ERROR:cmdstanpy:model compilation failed
```
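For reference, as far as I understand the OpenCL instructions, a plain CmdStan build (outside `cmdstanpy`) takes the same settings as `KEY=value` lines in `make/local`. Below is a minimal sketch of that mapping. `to_make_local_lines` is a hypothetical helper of mine, not a `cmdstanpy` function, and leaving out disabled flags is my own assumption.

```python
def to_make_local_lines(cpp_options):
    """Render a cpp_options dict as make-style KEY=value lines.

    Hypothetical helper for illustration only; not part of cmdstanpy.
    """
    lines = []
    for key, value in cpp_options.items():
        if isinstance(value, bool):
            if not value:
                # Assumption: a disabled flag is simply left out of make/local.
                continue
            value = "true"
        lines.append(f"{key}={value}")
    return lines


cpp_options = {
    "STAN_OPENCL": True,
    "OPENCL_PLATFORM_ID": 0,
    "OPENCL_DEVICE_ID": 0,
}
print("\n".join(to_make_local_lines(cpp_options)))
```

Comparing these lines with the actual `make/local` of the CmdStan installation could show whether both builds were configured with the same platform and device IDs.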

## Ubuntu 16

The other machine runs Ubuntu 16. There, the `nvidia-cuda-toolkit` provided through `apt-get` is quite old (CUDA 7.5). For that reason I installed `cuda 11.0` directly, together with the latest NVIDIA drivers (460), and added the CUDA paths to `PATH` and `LD_LIBRARY_PATH`. In this case the error I get when compiling is different and much longer than the one on Ubuntu 20:

```
ERROR:cmdstanpy:file /home/jmaronasm/Escritorio/phd/TRABAJANDO/CURRENT_PROJECTS/VIDRIOS_MODELO_JERÁRQUICO/FULLBayesian_HGM_Stan/stan_files/BNN.stan, exception ERROR
In file included from stan/lib/stan_math/stan/math/opencl/kernel_generator.hpp:133:0,
from stan/lib/stan_math/stan/math/opencl/rev/vari.hpp:7,
from stan/lib/stan_math/stan/math/rev/core/var.hpp:5,
from stan/lib/stan_math/stan/math/rev/core/profiling.hpp:6,
from src/cmdstan/write_profiling.hpp:4,
from src/cmdstan/command.hpp:17,
from src/cmdstan/main.cpp:1:
stan/lib/stan_math/stan/math/opencl/kernel_generator/multi_result_kernel.hpp: In instantiation of ‘stan::math::results_cl<T_results>::operator=(const stan::math::expressions_cl<T_expressions ...>&)::<lambda(auto:18 ...)> [with auto:18 = {std::integral_constant<long unsigned int, 0ul>}; T_expressions = {const stan::math::addition_operator_<stan::math::load_<stan::math::matrix_cl<double, void>&>, stan::math::elt_multiply_<stan::math::load_<const stan::math::matrix_cl<double, void>&>, st an::math::trigamm CONTINUES HERE
```
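To rule out the environment on this machine, the `PATH` and `LD_LIBRARY_PATH` entries could be checked from the same Python session that launches the compilation. This is only a sketch: `dirs_missing_from` is a hypothetical helper, and the two CUDA directories are assumed default install locations that may differ from the real ones.

```python
import os


def dirs_missing_from(env_value, required_dirs):
    """Return the required directories absent from an os.pathsep-separated value."""
    present = {os.path.normpath(p) for p in env_value.split(os.pathsep) if p}
    return [d for d in required_dirs if os.path.normpath(d) not in present]


# Assumed default install locations for CUDA 11.0; adjust to the actual install.
CUDA_BIN = "/usr/local/cuda-11.0/bin"
CUDA_LIB = "/usr/local/cuda-11.0/lib64"

print("missing from PATH:",
      dirs_missing_from(os.environ.get("PATH", ""), [CUDA_BIN]))
print("missing from LD_LIBRARY_PATH:",
      dirs_missing_from(os.environ.get("LD_LIBRARY_PATH", ""), [CUDA_LIB]))
```

An empty list on both lines would mean the process that runs `cmdstanpy` sees the expected CUDA directories.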

In both cases the errors do not seem to be related to missing libraries at link time or anything like that. Any thoughts before I open an issue?

Thank you!