SBC StanConnect 2021

hyunji.moon · April 9, 2021, 4:43pm

Hello,
@martinmodrak and I would like to organize a StanConnect session on Simulation-based calibration (SBC).

SBC is a relatively young but versatile diagnostic tool that can be used to check both mathematical models (prior, observational) and computational algorithms. We want the session to be as interactive as possible, so please reply to this post regarding any of the following.

1. speaker/poster
2. discussion topic suggestion
3. what you wish to learn from the tutorial
4. would like to attend

For 1, it would be great if you could suggest an abstract.
For 2, some example topics could be SBC computation speedup techniques, SBC for hierarchical models, etc.
For 3, @Dashadower and I are planning a tutorial with this package (tentative), and would be happy to tailor it to the needs of the audience. Please refer to the README for a brief SBC introduction with its use cases and references.

The tentative date is somewhere between the end of June and August (edit).

Thanks,
Hyunji

jsocolar · April 12, 2021, 1:51am

I would love to learn more about options for SBC for computationally expensive models/datasets. In the context of big data and a complex model, are there strategies for designing a population of smaller datasets and/or simpler models to (sorta kinda) validate the full computation?

asael_am · April 13, 2021, 9:18pm

4 please :)

JMeekes · April 14, 2021, 8:23am

Very interesting proposal!

I vote with Jacob but I would add that I would also explicitly like to relate this to point 2 in the readme (i.e., how does this help us apply approximation algorithms to complex models?).
I’m mostly interested in how SBC can be used to speed up model development and computation.
Tentative (depends on timing relative to moving house etc.)

maxbiostat · April 18, 2021, 11:31pm

I could talk about SBC in phylogenetics, which is joint work with Remco Bouckaert and which I’ve also discussed with @betanalpha. Very preliminary stuff. The downside is: there’s no Stan involved, although I’d be keen to learn about plotting/analysis routines that could be adapted.

Christopher-Peterson · April 18, 2021, 11:58pm

I’d be keen to attend & learn more about SBC.

storopoli · April 19, 2021, 8:17am

I would also be keen to attend and learn more!

PhilClemson · April 19, 2021, 8:45pm

I still feel like a novice when it comes to SBC, so I would be interested in any tutorials on both the theory and implementation. It would also be useful to showcase specific applications of SBC (happy to make a poster or short presentation on my own usage, for example).

hyunji.moon · May 25, 2021, 4:58pm

Final Schedule
Date: 8/31 9am to Noon (EST)

Talks
Graphical test for uniformity and its applications in SBC workflow (Teemu Säilynoja)
Prior Specification in the context of Simulation-Based Calibration (Paul Bürkner)
Workflow techniques for the robust use of Bayes factors (Daniel Schad)
Simulation-based calibration for Bayesian phylogenetics: dealing with huge models and an awkward parameter space (Luiz Max Carvalho)

Schedule (All times EST)
8:30am - 9:00am: Event opens, informal chat/networking
9:00am - 9:05am: Introduction to the Event by SGB representative
9:05am - 9:20am: Opening by Hyunji Moon and Andrew Gelman
9:20am - 10:00am: Teemu Säilynoja and Paul Bürkner’s talks [15 min. talk + 5 min. Q&A each]
10:00am - 10:10am: Break, tutorial setup
10:10am - 11:10am: Tutorial (Martin Modrak, Shinyoung Kim)
11:10am - 11:50am: Daniel Schad and Luiz Max Carvalho’s talks [15 min. talk + 5 min. Q&A each]
11:50am - 12:00pm: Closing

Abstracts for the talks:
Opening: Simulation-based data exploration

Prior Specification in the context of Simulation-Based Calibration

Performing simulation-based calibration (SBC) requires repeated sampling from the parameters’ priors and subsequently from the likelihood in its role as data generating distribution. Ideally, we have chosen our priors intelligently so that the resulting simulated data is within a reasonable range and similar in scale to our real-world data. However, in models where parameters are non-linearly related to the response, choosing priors that imply realistic-looking data is actually quite hard. Or, to view it from another perspective, a lot of models with weakly-informative priors will imply data that are orders of magnitude away from anything we would consider realistic. What is more, this may also have negative consequences on convergence and sampling efficiency in the subsequent model estimation. In my talk, I will illustrate these challenges, highlight some potential solutions and point to directions for future research.
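
For readers new to the procedure this abstract refers to, the sampling loop can be sketched as follows. This is a minimal illustration, not the speaker's code and not the API of the SBC package. It assumes a conjugate normal model (prior mu ~ N(0, 1), observations y ~ N(mu, 1)) so that the posterior is available in closed form and no Stan fit is needed. The function name `sbc_ranks` and its parameters are hypothetical.

```python
import random


def sbc_ranks(n_sims=1000, n_obs=10, n_draws=99, seed=1):
    """Return one SBC rank per simulated dataset (hypothetical helper).

    Assumed model: mu ~ N(0, 1), y_i ~ N(mu, 1) for i = 1..n_obs.
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        # 1. Draw a "true" parameter from the prior.
        mu_true = rng.gauss(0.0, 1.0)
        # 2. Simulate data from the likelihood given that parameter.
        y = [rng.gauss(mu_true, 1.0) for _ in range(n_obs)]
        # 3. Draw from the posterior. Here it is exact (conjugate update);
        #    in a real SBC run these draws would come from the fitted model.
        post_mean = sum(y) / (n_obs + 1)
        post_sd = (1.0 / (n_obs + 1)) ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        # 4. Rank of the true value among the posterior draws (0..n_draws).
        ranks.append(sum(d < mu_true for d in draws))
    return ranks


ranks = sbc_ranks()
print(min(ranks), max(ranks), sum(ranks) / len(ranks))
```

If the posterior draws are calibrated, the ranks should look roughly uniform on 0..n_draws. Step 3 is the only part that a real SBC run would replace with an actual model fit.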

Graphical test for uniformity and its applications in SBC workflow

Assessing the uniformity of the rank statistics of the prior draws is a central part of SBC; the histogram and the empirical CDF are the tools used in the original SBC paper. Unfortunately, the histogram does not take into account the dependency between bin heights, and users have to choose the number of bins. Comparing the empirical CDF of the rank statistics with that of random draws from a uniform distribution has also been suggested.
In our paper, we provide simultaneous confidence bands for the sample ECDF, which results in an intuitive graphical test for uniformity. The graphical nature of this test also provides feedback on the nature of the possible deviations from uniformity. An optimization-based and a simulation-based method for adjusting the pointwise confidence bands to obtain simultaneous coverage with a desired type 1 error rate are also presented. In my talk, I briefly introduce our graphical test and demonstrate how the test, together with the sbc function of rstan, can be applied to recognize common deviations from uniformity. I also briefly introduce the other main contribution of our paper which, by extending the simultaneous confidence bands to multiple sample comparison, allows for evaluating whether two or more samples originate from the same underlying distribution. This is especially useful as an alternative to the widely used trace plots and rank plots in assessing the convergence of MCMC chains.
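
To make the idea of a simultaneous ECDF check concrete, here is a rough sketch. It is not the method from the paper: instead of adjusting pointwise bands, it uses a simpler constant-width envelope whose threshold is calibrated by simulating uniform ranks. The function names `max_ecdf_deviation` and `simulated_threshold` are hypothetical, and ranks are assumed to be integers in 0..n_draws.

```python
import random


def max_ecdf_deviation(ranks, n_draws):
    """Largest gap between the rank ECDF and the discrete uniform CDF."""
    n = len(ranks)
    counts = [0] * (n_draws + 1)
    for r in ranks:
        counts[r] += 1
    cum = 0
    worst = 0.0
    for k in range(n_draws + 1):
        cum += counts[k]
        expected = (k + 1) / (n_draws + 1)  # uniform CDF at rank k
        worst = max(worst, abs(cum / n - expected))
    return worst


def simulated_threshold(n_ranks, n_draws, alpha=0.05, n_rep=2000, seed=1):
    """Simulated (1 - alpha) quantile of the deviation under uniform ranks."""
    rng = random.Random(seed)
    devs = sorted(
        max_ecdf_deviation(
            [rng.randint(0, n_draws) for _ in range(n_ranks)], n_draws
        )
        for _ in range(n_rep)
    )
    return devs[min(n_rep - 1, round((1 - alpha) * n_rep))]


# Example: ranks piled up at zero deviate far more than uniform ranks would.
biased = [0] * 200
print(max_ecdf_deviation(biased, 99) > simulated_threshold(200, 99, n_rep=200))
```

Because the threshold applies to the maximum deviation over all ranks at once, it gives simultaneous rather than pointwise coverage, at the cost of being less sensitive in the tails than bands that vary in width.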

Workflow techniques for the robust use of Bayes factors

It is unknown whether approximate Bayes factor estimates (e.g., using bridge sampling) are unbiased for complex analyses. We use simulation-based calibration as a tool to test the accuracy of Bayes factor estimates. Moreover, we study how Bayes factors misbehave under different conditions and suggest a workflow for the use of Bayes factors.

Simulation-based calibration for Bayesian phylogenetics: dealing with huge models and an awkward parameter space

Phylodynamics applies phylogenetic methods to study the evolutionary and epidemiological dynamics of pathogens and uncover the spatiotemporal patterns for the spread of viruses and bacteria. However, phylogenetic models are highly intractable, which requires the use of approximate sampling methods. In this setting, SBC could be employed to test and calibrate the approximation algorithms. Phylogenetics poses special difficulties to SBC for two main reasons: (i) it includes both discrete and continuous components, and (ii) there is no canonical representation of trees with a well-ordering, so a proper projection onto metric spaces is required for rank computation. In this talk, the main statistical issues in phylogenetic analysis will be discussed with a focus on SBC. Automated analysis from a Java application and its integration with other packages for further analyses such as plotting will be shown. Joint work with Remco Bouckaert (Auckland).

Thanks, co-organizers and speakers!
@andrewgelman @martinmodrak @Dashadower @paul.buerkner @maxbiostat

hyunji.moon · July 2, 2021, 3:49pm

The following is a list of supporting material for SBC. For those curious about model checking, some prior knowledge of SBC before the conference would never hurt :) Detailed background and an FAQ are documented in the SBC package README. Please reply if any literature is missing!

Theoretical support

Validating Bayesian Inference Algorithms with Simulation-Based Calibration (Talts, Betancourt, Simpson, Vehtari, Gelman, 2018)
Graphical Test for Discrete Uniformity and its Applications in Goodness of Fit Evaluation and Multiple Sample Comparison (Säilynoja, Bürkner, Vehtari, 2021)
Toward a principled Bayesian workflow in cognitive science (Schad, Betancourt, Vasishth, 2021)
Bayes factor workflow (Schad, Nicenboim, Bürkner, Betancourt, Vasishth, 2021)

Application support

Vignette

ECDF with code (a new implementation by Teemu Säilynoja will be available in the bayesplot and SBC packages soon)

hyunji.moon · August 20, 2021, 2:43pm

This is the link to register for StanConnect on August 31.

Our team has put great effort into developing a package and tutorials that introduce the ABCs of SBC, which can be easily extended to model checking in a Bayesian workflow. The following are ongoing documentation, and all feedback is welcome.

Bayesian Calibration Series 1 blog
SBC FAQ wiki (early draft)

I especially need help from the Stan community on the SBC FAQ. Please feel free to ask any questions about SBC and contribute to the FAQ list.

The motivation behind the FAQ is SBC’s constant evolution. Delightful though the development is, changing diagnostics lead to confusion, and I have received several questions about the newest SBC version and the reasoning behind its updates. Multiple factors, such as autocorrelation, interpretation, power and simulation scale, and the calibration target, should be considered for the best use of SBC. There is no one-size-fits-all answer, but as a person who appreciates the value of the prior recommendations in the Stan wiki, I thought that regularly updated recommendations based on the existing literature, communication with people at the forefront of SBC, and my own first-hand experiments and research could be helpful.

Raoul-Kima · September 10, 2021, 3:09pm

I couldn’t attend, but would have liked to. Are / will there be recordings of the talks?

I guess that info should be somewhere, but I couldn’t find it on the events page, nor did I see any general post about StanConnects that explains this in the forum under “events”.

(It doesn’t seem possible to enter the eventbrite event anymore.)

hyunji.moon · September 10, 2021, 6:25pm

Hi @Raoul-Kima, I will upload it this week and notify you in this thread.

hyunji.moon · September 10, 2021, 9:20pm

Meanwhile, for those who requested a recording of the tutorial, the basic structure and usage of SBC can be seen here: SBC Interface Introduction • SBC

Raoul-Kima · September 11, 2021, 1:00pm

Great, Thanks!

hyunji.moon · September 13, 2021, 11:36pm

The video of the conference has been uploaded: http://y2u.be/SbgAMkN18dA
