The next release of Stan should have expanded OpenCL, and thus GPU, support. We are currently looking for more real-world models with which we can test and evaluate the performance of the new backend.

We are looking for models that:

use one of the distributions listed below with somewhat large inputs (size of vector or array > 5000)

take a considerable amount of time to fit (an hour or more)

you can share the model and data for, even if only via e-mail and not on the forums.

A very simple model with the binomial distribution:

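data {
  int<lower=0> N;   // number of observations
  int y[N];         // successes per observation
  int x[N];         // trials per observation
  vector[N] w;      // covariate
}
parameters {
  vector[2] beta;   // intercept and slope
}
model {
  beta ~ normal(0, 1);
  // binomial expects a success probability in [0, 1], so this linear
  // predictor must stay in that range for the given w and beta.
  y ~ binomial(x, beta[1] + beta[2] * w);
}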

This model is faster on a GPU for N > 10k with a single MCMC chain, and for large N, fitting on a GPU is up to 60 times faster (tested using an AMD Radeon VII and an i7 CPU). The speedup also increases with multiple chains.
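If you want to try the binomial model above at these sizes before sending a real model, something like the following could generate test data. This is only a minimal sketch: the function name simulate_binomial_data, the coefficients (0.2, 0.5), the uniform covariate w, and the trial-count range are all assumptions picked so that beta[1] + beta[2] * w stays a valid probability. They are not the values used in the benchmark.

```python
import json
import random


def simulate_binomial_data(n, beta=(0.2, 0.5), max_trials=20, seed=1):
    """Simulate data matching the model's data block: N, y, x, w.

    Assumptions (not from the original post): w is uniform on [0, 1) and
    beta = (0.2, 0.5), so beta[0] + beta[1] * w stays within [0.2, 0.7]
    and is a valid binomial success probability.
    """
    rng = random.Random(seed)
    w = [rng.random() for _ in range(n)]
    x = [rng.randint(1, max_trials) for _ in range(n)]
    y = []
    for trials, covariate in zip(x, w):
        p = beta[0] + beta[1] * covariate
        # Draw a binomial count as a sum of Bernoulli trials (stdlib only).
        y.append(sum(rng.random() < p for _ in range(trials)))
    return {"N": n, "y": y, "x": x, "w": w}


if __name__ == "__main__":
    data = simulate_binomial_data(20_000)
    # Serialize to JSON, a data format that CmdStan can read.
    payload = json.dumps(data)
    print("observations:", data["N"], "JSON characters:", len(payload))
```

Increasing n past 10k should put the data in the range where the GPU is reported to start paying off for a single chain.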

Happy to supply data/models from the issues I was working on in my other post. My models are taking days to run, and I use several tricks to improve the performance.

Here’s code to simulate and fit hierarchical data. It uses my reduced-redundant-computation trick, but there’s still a final call to normal() at the end that has lots of input if you choose large values for any of the data-simulation parameters at the top of the R script. Increasing num_trials should have the most targeted impact on that final likelihood call; increasing the others will increase the input to the likelihood but will also increase the amount of computation that has to happen before the likelihood.

Rok,
I’m happy to supply data and a model by email. The data has some sensitivities, so it can’t be made public, but it can be shared to a limited extent. Model run time is 5.5 hours for 500 iterations.

Do you need an example that converges cleanly? I’m fighting with a model that has severe identification problems with one data set (but not with others). The one with severe identification problems can run for several days to get decent bulk and tail ESS, but even with adapt_delta at 0.99, I get a lot of divergences. Another data set using the same code runs reasonably well: it is a lot faster and has few or no divergences.

Great! I’ll send a link to a GitHub repository for an in-progress R package using RStan. The repository includes two data files that work well. The misbehaving dataset has not yet been published. I’ll send you the GitHub link and the misbehaving dataset via a message through Discourse later today.