Posteriordb, beta version 0.2

Hi all,

We have now, finally, bumped posteriordb to the next beta version, 0.2. Since 0.1, the main focus has been incorporating the very helpful feedback we received last time. A good presentation of the new posteriordb by @avehtari can be found here:

One of the main focuses has been making it easier for the Stan community to contribute new Stan posteriors. Here is some information on how to add a posterior using R:

In addition, we would be very happy to receive any suggestions, comments, and thoughts on how posteriordb could be made more useful to the community. The next two steps we are currently discussing are:

  1. Adding more probabilistic programming frameworks (PyMC3, TensorFlow Probability, and Pyro), with tests that the models are identical
  2. Including posterior expectations with their MCSE.
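On item 2, here is a minimal sketch of how a posterior expectation and its MCSE could be computed from a vector of draws, using the batch-means estimator. This is just one common MCSE estimator, chosen for illustration; I am not assuming it is the one posteriordb will adopt.

```python
import math
import random


def batch_means_mcse(draws, n_batches=20):
    """Estimate the Monte Carlo standard error of the posterior mean
    using the batch-means method: split the chain into consecutive
    batches and take sd(batch means) / sqrt(n_batches)."""
    n = len(draws) - (len(draws) % n_batches)  # drop the remainder
    batch_size = n // n_batches
    means = [
        sum(draws[i * batch_size:(i + 1) * batch_size]) / batch_size
        for i in range(n_batches)
    ]
    grand = sum(means) / n_batches
    var = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return math.sqrt(var / n_batches)


# Illustrative "chain": 4000 iid standard-normal draws.
random.seed(1)
draws = [random.gauss(0.0, 1.0) for _ in range(4000)]
mean = sum(draws) / len(draws)
mcse = batch_means_mcse(draws)
```

For iid draws this should land near 1/sqrt(4000) ≈ 0.016; for a correlated MCMC chain the batch means absorb the autocorrelation, which is the point of the method.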

With kind regards
Måns


Nice! One thing that might also be useful here: for models that finish in tenths of a second, the timings can be unreliable. For example, the garch model in our performance benchmarks finishes in less than a quarter of a second, and any individual run can vary by +/- 20%. It could be nice to do something like Google Benchmark does, where a small heuristic decides how many times to run a benchmark in order to get a stable time estimate. Even for models like the SIR model that take a few minutes to run, you probably want to run them 30 times or so.
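To make the idea concrete, here is a rough sketch of that kind of heuristic: keep re-running the benchmark until the accumulated wall time passes some minimum, then report the mean per-run time. The `min_total` and `max_reps` values are arbitrary illustrations, not anything Google Benchmark or posteriordb actually uses.

```python
import time


def measure(fn, min_total=0.25, max_reps=10_000):
    """Keep re-running fn until the accumulated wall time passes
    min_total seconds (or max_reps is hit), then report the mean
    time per run and how many repetitions were needed."""
    times = []
    elapsed = 0.0
    while elapsed < min_total and len(times) < max_reps:
        t0 = time.perf_counter()
        fn()
        dt = time.perf_counter() - t0
        times.append(dt)
        elapsed += dt
    return sum(times) / len(times), len(times)


# Usage: a cheap function automatically gets many repetitions,
# while a slow model would be run only a handful of times.
mean_t, reps = measure(lambda: sum(i * i for i in range(1000)))
```

The nice property is that fast models get averaged over thousands of runs while slow ones don't blow up the total benchmark time.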

I agree; for timing purposes, many models need to be run multiple times. The question is deciding which ones, and for how long. I could run them on my machine to get a rough idea of the speed, although that is probably not a good long-term solution once people start to supply posteriors. Any thoughts?

Oh woof, I’m not sure! The code I posted above has some of what Google uses. I think you essentially want a hard limit on each model’s maximum time and on the number of iterations it can run (say, maybe 15 minutes and 30K iterations)? There are definitely more complicated things you can do, but I think some simple-ish heuristic would work.
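A sketch of what that capped heuristic could look like: repeat until the run times look stable, with hard time and iteration caps as a backstop. The 15-minute / 30K-iteration caps are just the ballpark numbers from this post, and the coefficient-of-variation stopping rule is only one possible stability criterion.

```python
import statistics
import time


def run_until_stable(fn, max_seconds=15 * 60, max_iters=30_000,
                     min_iters=5, target_cv=0.05):
    """Repeat fn until its run times look stable (coefficient of
    variation of the measured times below target_cv), or until a
    hard time/iteration cap is hit, whichever comes first."""
    times = []
    start = time.perf_counter()
    while (len(times) < max_iters
           and time.perf_counter() - start < max_seconds):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
        if len(times) >= min_iters:
            cv = statistics.stdev(times) / statistics.mean(times)
            if cv < target_cv:
                break
    return statistics.mean(times), len(times)


# Usage: a roughly constant-time "model" should stop well before
# the caps; the tight caps here are just to keep the demo quick.
mean_t, n = run_until_stable(lambda: time.sleep(0.005),
                             max_seconds=2.0, max_iters=100)
```

A noisy fast model keeps running until the estimate settles, while the caps make sure a pathological model can't stall the whole suite.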