I’m trying to fit a model whose log likelihood is defined in a user-defined function that involves a lot of matrix math (multiplications and inverses). My machine has a GPU and OpenCL recognises it, but running the variational algorithm as follows gives no speedup at all:

```
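library(cmdstanr)

# Compile the model with OpenCL support enabled
model = cmdstan_model('model.stan', cpp_options=list(stan_opencl=TRUE))
results = model$variational(
  data=data,
  seed=42,
  iter=2000,
  opencl_ids=c(0, 0),  # OpenCL platform ID and device ID
  algorithm='meanfield',
  eta=1,
  adapt_engaged=FALSE
)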
```

Does variational inference not use the GPU, or is there a specific way to write the matrix operations so that they run on it?

I’ve posted the full code in stan-dev/docs issue #442 on GitHub, “[FR] Adding Gaussian Process Latent Variable Model Examples”.