Finally, we are finalizing a beta version of the posterior database: a repository of models, data, posteriors, and posterior draws (with information on how they were computed) to make known Stan models easier to use for algorithm and diagnostic development, testing, and teaching.
The main goal is to have all model code, data, and so forth in one extensively tested format that is easy to access from both Python and R. posteriordb is also meant to be a simple way of communicating posteriors: just give a posterior name and everything can be accessed from posteriordb.
- posterior: a JSON description binding the pieces below together
- model: a model (currently Stan only, but hopefully other probabilistic programming frameworks in the future)
- data: together with a model, the data defines the posterior
- gold standard: verified, (almost) independent posterior draws, computed either with dynamic HMC or analytically
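As a rough illustration of how a posterior entry binds these pieces together, a JSON description might look something like the following. The field names here are made up for illustration and are not the actual posteriordb schema:

```json
{
  "name": "eight_schools",
  "model": "eight_schools_centered",
  "data": "eight_schools",
  "gold_standard": "eight_schools-draws",
  "keywords": ["stan_benchmark"],
  "dimensions": 10
}
```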
To access a posterior from R, just use:

```r
remotes::install_github("MansMeg/posteriordb", subdir = "rpackage/")
library(posteriordb)
eight_schools <- posterior("eight_schools")
# Access Stan code
stan_code(eight_schools)
# Get data
get_data(eight_schools)
# Get posterior draws (returns a draws object from the posterior R package)
gold_standard_draws(eight_schools)
```
or

```r
pkpd <- posterior("pkpd")
stan_code(pkpd)
get_data(pkpd)
gold_standard_draws(pkpd) # No posterior has been computed yet, though
```
For more details, see the GitHub page: https://github.com/MansMeg/posteriordb
Currently, the database contains the Stan benchmark models by @betanalpha (except the simulation of multivariate normal draws), a Bayesian linear regression, and a Latent Dirichlet Allocation model with data. The next goal is to add 200+ Stan example models and then keep expanding the database.
The posteriors, data, and models have keywords, so it is possible to select a specific set for testing, e.g. “stan_benchmark” for the set currently used to benchmark Stan, or “multimodal” if we know that the posterior distribution is multimodal. It is also possible to access the dimension of the posterior, so tests can be run only for models with more than 100 parameters.
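To illustrate, keyword- and dimension-based selection could be sketched in plain Python like this. The metadata records below are made up for illustration, and `select` is a hypothetical helper, not the actual posteriordb API:

```python
# Toy metadata records standing in for posteriordb entries (illustrative only).
posteriors = [
    {"name": "eight_schools", "keywords": ["stan_benchmark"], "dimensions": 10},
    {"name": "pkpd", "keywords": [], "dimensions": 45},
    {"name": "big_hierarchical", "keywords": ["multimodal"], "dimensions": 250},
]

def select(posteriors, keyword=None, min_dimensions=0):
    """Return names of posteriors matching a keyword and a minimum dimension."""
    return [
        p["name"]
        for p in posteriors
        if (keyword is None or keyword in p["keywords"])
        and p["dimensions"] >= min_dimensions
    ]

print(select(posteriors, keyword="stan_benchmark"))  # ['eight_schools']
print(select(posteriors, min_dimensions=100))        # ['big_hierarchical']
```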
We’ll add more functionality to make testing and summarizing the test results easier.
Although we are starting with Stan, the idea is to support benchmarks for other probabilistic programming frameworks as well in the long run.
Tagging people who may be interested:
- @betanalpha We have tried to add the benchmark models, but may have missed something when we tried to describe the models.
- @stevebronder for speed benchmark testing?
- @bbbales2 for adaptation testing?
- @yuling for our discussion on the LDA model and data
- @breckbaldwin for making Stan a reference. The next step is methods for comparing posteriors, but the rest is now done.
/MĂĄns