Request for help: Looking for large models to test with Stan GPU support

Hi all,

The next release of Stan should include expanded OpenCL, and thus GPU, support. We are currently looking for more real-world models with which we can test and evaluate the performance of the new backend.

We are looking for models that:

  • use one of the distributions listed below with somewhat large inputs (vector or array size > 5000)
  • take a considerable amount of time to fit (at least an hour or more)
  • you can share (model and data), even if only via e-mail and not on the forums.

List of supported lpdf/lpmf functions (a small usage sketch follows the list):

  • bernoulli_lpmf, bernoulli_logit_lpmf, bernoulli_logit_glm_lpmf
  • beta_lpdf, beta_proportion_lpdf
  • binomial_lpmf
  • categorical_logit_glm_lpmf
  • cauchy_lpdf
  • chi_square_lpdf
  • double_exponential_lpdf
  • exp_mod_normal_lpdf
  • exponential_lpdf
  • frechet_lpdf
  • gamma_lpdf
  • gumbel_lpdf
  • inv_chi_square_lpdf
  • inv_gamma_lpdf
  • logistic_lpdf
  • lognormal_lpdf
  • neg_binomial_lpmf, neg_binomial_2_lpmf, neg_binomial_2_log_lpmf, neg_binomial_2_log_glm_lpmf
  • normal_lpdf, normal_id_glm_lpdf
  • ordered_logistic_glm_lpmf
  • pareto_lpdf, pareto_type_2_lpdf
  • poisson_lpmf, poisson_log_lpmf, poisson_log_glm_lpmf
  • rayleigh_lpdf
  • scaled_inv_chi_square_lpdf
  • skew_normal_lpdf
  • student_t_lpdf
  • uniform_lpdf
  • weibull_lpdf
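
As a quick illustration of how these are used (a minimal sketch with made-up names, not one of the benchmark models), here is a logistic regression written with bernoulli_logit_glm, one of the fused GLM functions above:

data {
  int<lower=0> N;              // number of observations
  int<lower=0> K;              // number of predictors
  matrix[N, K] X;              // design matrix
  int<lower=0, upper=1> y[N];  // binary outcome
}
parameters {
  real alpha;
  vector[K] beta;
}
model {
  alpha ~ normal(0, 1);
  beta ~ normal(0, 1);
  // the GLM form fuses the linear predictor and the log-likelihood,
  // so the whole call can be handled by the OpenCL backend
  y ~ bernoulli_logit_glm(X, alpha, beta);
}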

Thank you!

To give you a taste of what is to come:

A very simple model with the binomial distribution:

data {
  int N;
  int y[N];        // successes
  int x[N];        // number of trials
  vector[N] w;     // covariate
}
parameters {
  vector[2] beta;
}
model {
  beta ~ normal(0, 1);
  // inv_logit keeps the success probability in [0, 1]
  y ~ binomial(x, inv_logit(beta[1] + beta[2] * w));
}

is faster on a GPU for N > 10k with a single MCMC chain, and for large N, fitting on a GPU is up to 60 times faster (tested with an AMD Radeon VII GPU and an i7 CPU). The speedup also increases with multiple chains.

Thanks Rok,

Happy to supply data/models from the issues I was working on in my other post. My models take days to run, and I use several tricks to improve performance.

I can supply them via email if that interests you.

That would be great. Thank you!

Here’s code to simulate and fit hierarchical data. It uses my reduced-redundant-computation trick, but there’s still a final call to normal() at the end that has a large input if you choose large values for any of the data-simulation parameters at the top of the R script. Increasing num_trials should have the most targeted impact on that final likelihood call; increasing the others will also enlarge the likelihood input, but will additionally increase the amount of computation that has to happen before the likelihood call.

hwg_fast.r (8.5 KB) hwg_fast.stan (3.4 KB) helper_functions.r (5.0 KB)
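
In case it helps to see the general shape without opening the files, here is a minimal sketch (illustrative names only, not the attached hwg_fast.stan): a hierarchical model whose single vectorized normal() call at the end dominates the cost once the number of observations gets large:

data {
  int N;                        // total observations (num_trials * num_subjects)
  int K;                        // number of subjects
  int<lower=1, upper=K> id[N];  // subject index for each observation
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[K] z;                  // non-centered subject effects
  real<lower=0> sigma;
}
transformed parameters {
  vector[K] subj_mean = mu + tau * z;
}
model {
  mu ~ normal(0, 1);
  tau ~ normal(0, 1);
  z ~ std_normal();
  sigma ~ normal(0, 1);
  // the final, large vectorized likelihood call
  y ~ normal(subj_mean[id], sigma);
}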

If you find that one useful, I can also create a version that has a binomial outcome instead of normal.

Rok,
I’m happy to supply data and a model by email. The data has some sensitivities, so it can’t be made public, but it can be shared to a limited extent. Model run time is 5.5 hours for 500 iterations.

Thanks! My email is rok.cesnovar at fri.uni-lj.si

Do you need an example that converges cleanly? I’m fighting with a model that has severe identification problems with one data set (but not with others). That data set can take several days to reach decent bulk and tail ESS, and even with adapt_delta at 0.99 I get a lot of divergences. Another data set using the same code runs reasonably well: much faster and with few or no divergences.

Kent

No, even examples that do not converge cleanly are more than welcome.

Great! I’ll send a link to a GitHub repository for an in-progress R package that uses RStan. The repository includes two data files that work well; the misbehaving dataset has not yet been published, so I’ll send both the GitHub link and that dataset via a Discourse message later today.

Kent
