Promoting posteriordb into an official Stan project

I’m a bit confused about what the list of implementation details is. Also, it is not clear what “some of Aki’s comments” refers to.

I think it would be useful to first discuss the different goals of the project and how they align with the needs of the Stan project. The current use case scenarios are at https://github.com/MansMeg/posteriordb/blob/master/doc/use_cases.md. These use case descriptions take into account the comments from Stan developers that we asked for on Discourse. I’m listing just the titles here:

  • Testing
    • Testing implementations of inference algorithms with asymptotically decreasing bias and variance (such as MCMC)
    • Testing implementations of inference algorithms with asymptotic bias (such as Variational inference)
    • System testing
    • Performance testing
  • Efficiency comparisons of inference algorithms with asymptotically decreasing bias and variance
  • Explorative analysis of algorithms
  • Developing new algorithms for interesting models
  • Code examples

Based on this I would assume that the above use case list is ok?

But then this indicates that the above use case list is not ok?

I agree that verifying the estimation of expectation values in a high-level, independent way is one important use case, but we also list other use cases. For example, regular performance testing of new Stan releases, to check that performance has not regressed, fits the core goals of Stan in my opinion. I also think it’s useful to include difficult posteriors for which we can’t currently get verified expectation values, to push algorithm development forward. We are tagging the posteriors so that each use case can easily pick a suitable set of posteriors (e.g. based on whether they have reference expectation values).
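To make the tagging idea concrete, here is a minimal sketch of how a use case could pick its set of posteriors by tag. This is not the posteriordb API: the catalogue, the posterior names, and the tag names (`has_reference`, `difficult`, and so on) are all hypothetical and only illustrate the selection logic.

```python
from typing import AbstractSet, Dict, List, Set

# Hypothetical catalogue: posterior names and tags are made up for illustration
# and do not correspond to actual posteriordb entries.
POSTERIORS: Dict[str, Set[str]] = {
    "example_a": {"has_reference", "regression"},
    "example_b": {"difficult"},
    "example_c": {"has_reference", "hierarchical"},
}


def select_posteriors(
    catalogue: Dict[str, Set[str]],
    required: AbstractSet[str],
    excluded: AbstractSet[str] = frozenset(),
) -> List[str]:
    """Return sorted names of posteriors that carry every required tag
    and none of the excluded tags."""
    return sorted(
        name
        for name, tags in catalogue.items()
        if required <= tags and not (tags & excluded)
    )


# A use case that checks expectation values needs reference values.
with_reference = select_posteriors(POSTERIORS, {"has_reference"})

# Algorithm development may instead want the posteriors without reference values.
without_reference = select_posteriors(POSTERIORS, set(), {"has_reference"})
```

With something like this, testing of expectation values would only run on `with_reference`, while exploratory algorithm work could also use the posteriors in `without_reference`.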

We also list explorative analysis, which doesn’t need to be formal.

I’m not certain, but I guess that “tools” means other probabilistic programming frameworks. If so, that discussion is orthogonal to the list of goals above.

This is one of the reasons why the posteriordb project was started: to make it more explicit what we think is relevant and required for good comparisons. We have seriously considered making posteriordb useful for other communities, to improve how things are done, but we have also thought that the support for other frameworks should be limited and the focus should be on Stan. Do you object to any collaboration with other probabilistic programming communities? If 95% of posteriordb is for Stan, is the other 5% a show stopper for calling it a Stan project? Would you like that 5% to be moved to another package?

We are currently talking with others to align goals. The goals will not be exactly the same, but there is useful overlap. By talking to others, we also influence them to see our point of view.

Yes, although I would prefer to say: the official benchmarking used and recommended by the Stan project will be documented and implemented in posteriordb. This was the reason we started work on this. Currently the Stan project recommendations are scattered, and recommendations are less likely to be followed if there is no easy-to-use software.

We’re happy to get help writing up more details of the Stan project recommendations for each use case. The use cases we list have come up in discussions with Stan developers. One of the use cases is making stan-dev/stat_comp_benchmarks (“Benchmark Models for Evaluating Algorithm Accuracy”) easier to expand, and checking whether the current “preliminary empirical results” can be upgraded from preliminary. The models/posteriors in stat_comp_benchmarks have also been used in other use cases, but for those it was even more important to have a wider set of models.

Can you specify for which of the listed use cases you are worried about using diagnostic code provided by external projects? We do mention MCMC diagnostics on the reference posterior page https://github.com/MansMeg/posteriordb/blob/master/doc/REFERENCE_POSTERIOR_DEFINITION.md, which is also based on your comments, but there we don’t define how the diagnostics should be computed. The GitHub repo main page illustrates the database part using diagnostics from the posterior package. posterior is a Stan project used by CmdStanR. For Python the related package is ArviZ, which is an external project used by CmdStanPy. The ArviZ developers include some Stan developers, and they are willing to have the same diagnostics in R and Python. The discussion of why CmdStanR and CmdStanPy are not using the C++ implementations is worth its own thread.

Same reliance as for CmdStanR, CmdStanPy and PyStan, that is either posterior or ArviZ. So no extra dependencies.
