SBC StanConnect 2021

Final Schedule
Date: 8/31 9am to Noon (EST)

Talks
Graphical test for uniformity and its applications in SBC workflow (Teemu Säilynoja)
Prior Specification in the context of Simulation-Based Calibration (Paul Bürkner)
Workflow techniques for the robust use of Bayes factors (Daniel Schad)
Simulation-based calibration for Bayesian phylogenetics: dealing with huge models and an awkward parameter space (Luiz Max Carvalho)

Schedule (All times EST)
8:30am - 9:00am: Event opens, informal chat/networking
9:00am - 9:05am: Introduction to the Event by SGB representative
9:05am - 9:20am: Opening by Hyunji Moon and Andrew Gelman
9:20am - 10:00am: Talks by Teemu Säilynoja and Paul Bürkner [each: 15-min. talk + 5-min. Q&A]
10:00am - 10:10am: Break, tutorial setup
10:10am - 11:10am: Tutorial (Martin Modrak, Shinyoung Kim)
11:10am - 11:50am: Talks by Daniel Schad and Luiz Max Carvalho [each: 15-min. talk + 5-min. Q&A]
11:50am - 12:00pm: Closing

Abstracts for the talks:
Opening: Simulation-based data exploration

Prior Specification in the context of Simulation-Based Calibration

Performing simulation-based calibration (SBC) requires repeated sampling from the parameters’ priors and subsequently from the likelihood in its role as the data-generating distribution. Ideally, we have chosen our priors intelligently so that the resulting simulated data are within a reasonable range and similar in scale to our real-world data. However, in models where parameters are non-linearly related to the response, choosing priors that imply realistic-looking data is actually quite hard. Or, to view it from another perspective, a lot of models with weakly informative priors will imply data that are orders of magnitude away from anything we would consider realistic. What is more, this may also have negative consequences for convergence and sampling efficiency in the subsequent model estimation. In my talk, I will illustrate these challenges, highlight some potential solutions, and point to directions for future research.
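
To make the scale problem concrete, here is a minimal prior predictive sketch. It is not taken from the talk: the model (a Poisson mean with a log link and a single intercept), the prior standard deviations, and the function names are all assumptions chosen for illustration.

```python
import math
import random

def prior_predictive_means(prior_sd, n_sims=1000, seed=1):
    """Draw intercepts from Normal(0, prior_sd) on the log scale and
    return the Poisson means exp(intercept) that they imply."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(0.0, prior_sd)) for _ in range(n_sims)]

def log10_range(means):
    """Span, in orders of magnitude, between the smallest and largest mean."""
    logs = [math.log10(m) for m in means]
    return max(logs) - min(logs)

# Assumed priors: sd 10 looks "weakly informative", sd 1 is much tighter.
wide = prior_predictive_means(prior_sd=10.0)
narrow = prior_predictive_means(prior_sd=1.0)

print(f"prior sd 10: implied means span {log10_range(wide):.1f} orders of magnitude")
print(f"prior sd 1:  implied means span {log10_range(narrow):.1f} orders of magnitude")
```

Because the intercept enters through an exponential, a prior that looks harmless on the log scale spreads the implied counts over many orders of magnitude, which is the kind of unrealistic simulated data the abstract describes.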

Graphical test for uniformity and its applications in SBC workflow

Assessing the uniformity of the rank statistics of the prior draws is a central part of SBC; the histogram and the empirical CDF are the tools used in the original SBC paper. Unfortunately, the histogram doesn’t take into account the dependency between bin heights, and users have to choose the number of bins. Comparing the empirical CDF of the rank statistics with that of random draws from a uniform distribution has also been suggested.
In our paper, we provide simultaneous confidence bands for the sample ECDF, which results in an intuitive graphical test for uniformity. The graphical nature of this test also provides feedback on the nature of the possible deviations from uniformity. An optimization method and a simulation-based method for adjusting the pointwise confidence bands to obtain simultaneous coverage with a desired type 1 error rate are also presented. In my talk, I briefly introduce our graphical test and demonstrate how the test, together with the sbc function of rstan, can be applied to recognize common deviations from uniformity. I also briefly introduce the other main contribution of our paper, which, by extending the simultaneous confidence bands to multiple sample comparison, allows for evaluating whether two or more samples originate from the same underlying distribution. This is especially useful as an alternative to the widely used trace plots and rank plots in assessing the convergence of MCMC chains.
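
As a rough illustration of the idea, the sketch below computes SBC ranks for a toy conjugate normal model and checks their ECDF against a band widened by simulation so that it holds at all grid points at once. This is a simplification and not the procedure from the paper or the sbc function of rstan: it uses a normal approximation for the pointwise spread, and the model, grid, and function names are assumptions for illustration.

```python
import bisect
import math
import random

def sbc_uniform_ranks(n_sims, n_obs, n_draws, rng, sd_scale=1.0):
    """SBC for theta ~ N(0,1), y_i ~ N(theta,1) with the exact posterior.
    sd_scale != 1 deliberately miscalibrates the posterior draws."""
    out = []
    for _ in range(n_sims):
        theta = rng.gauss(0.0, 1.0)
        y = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        post_mean = sum(y) / (n_obs + 1)
        post_sd = sd_scale * math.sqrt(1.0 / (n_obs + 1))
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        rank = sum(d < theta for d in draws)
        # Randomized rank in [0, 1): uniform when the posterior is calibrated.
        out.append((rank + rng.random()) / (n_draws + 1))
    return out

def max_scaled_deviation(u, grid):
    """Largest |ECDF(z) - z| over the grid, in pointwise standard errors."""
    n, s = len(u), sorted(u)
    return max(
        abs(bisect.bisect_right(s, z) / n - z) / math.sqrt(z * (1 - z) / n)
        for z in grid
    )

def simulated_critical_value(n, grid, rng, alpha=0.05, n_rep=2000):
    """Quantile of the max deviation under uniformity; using it as the band
    multiplier gives approximately simultaneous coverage over the grid."""
    stats = sorted(
        max_scaled_deviation([rng.random() for _ in range(n)], grid)
        for _ in range(n_rep)
    )
    return stats[min(math.ceil((1 - alpha) * n_rep) - 1, n_rep - 1)]

rng = random.Random(2021)
n_sims = 200
grid = [k / 20 for k in range(1, 20)]
crit = simulated_critical_value(n_sims, grid, rng)

u_good = sbc_uniform_ranks(n_sims, n_obs=10, n_draws=99, rng=rng)
u_bad = sbc_uniform_ranks(n_sims, n_obs=10, n_draws=99, rng=rng, sd_scale=0.5)
stat_good = max_scaled_deviation(u_good, grid)
stat_bad = max_scaled_deviation(u_bad, grid)

print(f"band multiplier: {crit:.2f}")
print(f"exact posterior: max deviation {stat_good:.2f}")
print(f"too-narrow posterior: max deviation {stat_bad:.2f}")
```

The band at a grid point z is z ± crit * sqrt(z(1 - z)/n). An ECDF that leaves the band anywhere signals non-uniform ranks, and where it leaves hints at the kind of miscalibration: the too-narrow posterior pushes ranks toward both extremes.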

Workflow techniques for the robust use of Bayes factors

It is unknown whether approximate Bayes factor estimates (e.g., using bridge sampling) are unbiased for complex analyses. We use simulation-based calibration as a tool to test the accuracy of Bayes factor estimates. Moreover, we study how Bayes factors misbehave under different conditions and suggest a workflow for the use of Bayes factors.
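
One way such a check can be set up is sketched below. It is a toy example and not the speakers' procedure: the two models, the closed-form Bayes factor, the artificial bias factor, and all function names are assumptions for illustration. Models are drawn from their prior probabilities, data are simulated from the drawn model, and the average posterior probability of a model should then match its prior probability if the Bayes factors are accurate.

```python
import math
import random

def log_bf10(ybar, n):
    """Exact log Bayes factor for M1: theta ~ N(0,1) against M0: theta = 0,
    with y_i ~ N(theta, 1); it depends on the data only through the mean."""
    return -0.5 * math.log(n + 1) + 0.5 * ybar**2 * n**2 / (n + 1)

def posterior_prob_m1(log_bf, prior_m1=0.5):
    """Posterior probability of M1 from a log Bayes factor and a prior."""
    log_odds = log_bf + math.log(prior_m1 / (1 - prior_m1))
    return 1.0 / (1.0 + math.exp(-log_odds))

def simulate_ybar(n, rng, prior_m1=0.5):
    """Draw a model from the prior, then the sample mean of n observations.
    The mean is simulated directly because it is a sufficient statistic."""
    theta = rng.gauss(0.0, 1.0) if rng.random() < prior_m1 else 0.0
    return rng.gauss(theta, 1.0 / math.sqrt(n))

rng = random.Random(7)
n = 10
ybars = [simulate_ybar(n, rng) for _ in range(4000)]

# Accurate Bayes factors: the average should be close to the prior of 0.5.
exact = sum(posterior_prob_m1(log_bf10(y, n)) for y in ybars) / len(ybars)

# Hypothetical biased estimator that inflates every Bayes factor threefold.
biased = sum(
    posterior_prob_m1(log_bf10(y, n) + math.log(3.0)) for y in ybars
) / len(ybars)

print(f"average posterior P(M1), exact BF:  {exact:.3f}")
print(f"average posterior P(M1), biased BF: {biased:.3f}")
```

In a real analysis the closed-form Bayes factor would be replaced by the approximate estimate under test, and a clear gap between the average posterior probability and the prior would point to a biased estimator.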

Simulation-based calibration for Bayesian phylogenetics: dealing with huge models and an awkward parameter space

Phylodynamics applies phylogenetic methods to study the evolutionary and epidemiological dynamics of pathogens and uncover the spatiotemporal patterns for the spread of viruses and bacteria. However, phylogenetic models are highly intractable, which requires the use of approximate sampling methods. In this setting, SBC could be employed to test and calibrate the approximation algorithms. Phylogenetics poses special difficulties to SBC for two main reasons: (i) it includes both discrete and continuous components, and (ii) there is no canonical representation of trees with a well-ordering, so a proper projection onto metric spaces is required for rank computation. In this talk, the main statistical issues in phylogenetic analysis will be discussed with a focus on SBC. Automated analysis from a Java application and its integration with other packages for further analyses such as plotting will be shown. Joint work with Remco Bouckaert (Auckland).

Thanks, co-organizers and speakers!
@andrewgelman @martinmodrak @Dashadower @paul.buerkner @maxbiostat
