Request for comments: Bayesian Posterior Database

Hi everyone!

@avehtari, @paul.buerkner, Eero Linna and I have put together the first draft of a posterior database. The idea is to collect posterior distributions (i.e. data and models) and potential gold standard posterior samples in a database for easy access. For now it contains only a few examples, but we hope that once the structure and general idea are settled, adding new posteriors, models and data will be relatively quick. The repository README should contain all necessary information.

All comments are welcome!

Hopefully we will also soon be done with bayesbenchr, an R package that can be used to evaluate different diagnostics and inference methods on the posteriors in the posterior database.


Hi, sounds great!

ArviZ would be interested in implementing a wrapper for this.

We currently have some simple example posteriors so users can start to play with them.

import arviz as az
idata = az.load_arviz_data("centered_eight")

It would also be interesting to have the following groups:

posterior (default)
posterior predictive
prior
prior predictive
elementwise loglikelihoods

That's great!

Is there some way we could build a Python package that would fit these needs and make the database usable from ArviZ? I know Eero is working on a Python package, so any suggestions there? I think we would like the Python and R APIs to be quite similar, so you could look at the R API and see if there is something you miss or would like to change.

Your suggestions on priors, likelihoods, predictives etc. are something we have discussed and, as you mention, they would be valuable to have. The question is how to include them. Currently we can get them from the Stan code, so the question is whether we should also include them as R and Python functions.

/Måns

Cool idea! I’ve seen something similar floated in the conclusion of https://arxiv.org/abs/1904.04484, and I’m wondering if you have any thoughts about:

  • How precise should the submitted posteriors be? What are the precision metrics? (minimum tail ESS?)
  • Given the substantial computational requirement for some posterior distributions, is it sensible to also store (potentially generative) approximations to the posteriors based on the “gold standard samples”? These approximations could be used to cheaply generate an arbitrary number of samples for the purpose of making figures look good. I guess one can sample from the posterior predictive distribution (PPD) an arbitrary number of times given a fixed sample from the posterior, but what if generating from the PPD is computationally expensive? (Perhaps not the most likely of scenarios)
  • What kind of metadata are you interested in collecting? (Sampler type / adaptation diagnostics / tuning parameters?) It would be interesting to know the computational time / cost / environment used to obtain the archived samples. This is definitely off topic (more related to ML than MCMC-based statistics), but it would be interesting to collect this information to estimate energy usage in a manner similar to https://arxiv.org/abs/1906.02243. I’ve seen a few models that have substantial compute requirements and subsequently churn through a lot of cluster time & credits (see Section 4.2 of https://projecteuclid.org/euclid.aoas/1560758424 for example).

These things are probably beyond the scope of your current interests, but I’d be interested to know your collective thoughts on them. I really do like the idea of reusing posterior distributions in future analyses :).


The simplest thing would be to provide the results as a dictionary (e.g. JSON). That can then easily be converted to InferenceData with az.from_dict.
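
As a rough sketch of what that could look like: the JSON layout below (one object per group, each variable stored as nested lists shaped chain × draw) is purely hypothetical and not the database's actual format, and load_groups / to_inference_data are illustrative names. The conversion step assumes the ArviZ 0.x signature of az.from_dict, which accepts the groups as keyword arguments.

```python
import json

# Group names from the list earlier in the thread; the JSON layout is an assumption.
GROUPS = ("posterior", "posterior_predictive", "prior",
          "prior_predictive", "log_likelihood")

# Hypothetical file content: 2 chains x 3 draws of a scalar parameter "mu".
EXAMPLE_JSON = """
{
  "posterior": {"mu": [[0.1, 0.3, -0.2], [0.0, 0.4, 0.2]]},
  "prior":     {"mu": [[1.5, -2.0, 0.7], [0.3, -0.9, 2.1]]}
}
"""

def load_groups(text):
    """Parse JSON text into {group: {variable: nested list}} and reject unknown groups."""
    groups = json.loads(text)
    unknown = sorted(set(groups) - set(GROUPS))
    if unknown:
        raise ValueError(f"unknown groups: {unknown}")
    return groups

def to_inference_data(groups):
    """Convert the parsed groups to InferenceData (assumes ArviZ 0.x and NumPy)."""
    # Imported lazily so the parsing step works without ArviZ installed.
    import arviz as az
    import numpy as np
    arrays = {
        group: {var: np.asarray(values) for var, values in variables.items()}
        for group, variables in groups.items()
    }
    return az.from_dict(**arrays)

groups = load_groups(EXAMPLE_JSON)
print(sorted(groups))
```

Calling to_inference_data(groups) would then give one group per top-level key, provided the installed ArviZ version still takes the groups as keywords.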

We use an xarray.Dataset for each group.

Practically independent draws, obtained from long chains with no divergences and thinned to the desired number of draws.
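
The thinning step could be sketched roughly as follows. This is only an illustration with a made-up helper name (thin_chain) and a stand-in chain; thinning by itself does not guarantee independence, so the result would still need to be checked (e.g. with ESS) and the divergence check is not shown.

```python
def thin_chain(draws, n_keep):
    """Keep n_keep evenly spaced draws from a single chain (a list of draws)."""
    if n_keep <= 0 or n_keep > len(draws):
        raise ValueError("n_keep must be between 1 and len(draws)")
    stride = len(draws) // n_keep       # spacing between retained draws
    return draws[::stride][:n_keep]     # drop any leftover draws at the end

# Stand-in for a long chain: the draw indices 0..9999 instead of real samples.
long_chain = list(range(10_000))
thinned = thin_chain(long_chain, 1_000)
print(len(thinned), thinned[:3])
```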

Yes