I'd like to bounce some ideas on Bayesian workflow off you. GPS provides raw data about a location; navigation systems use this data to guide users to their destinations. Similarly, diagnostics in a Bayesian workflow allow us to construct a GPS in model space. How can the idea of typifying paths (below) be improved to build a good navigation system that gives each modeler localized decision support? I’m curious about relevant literature (e.g. Bayesian workflow case studies with real data and a specific application) that can help me flesh out the navigation idea and provide documentation complementing the SBC package vignettes, e.g. Small model implementation workflow • SBC. Looking forward to learning from any inconsistencies between our internal world models!
As a modeler explores model space with the Bayesian workflow map, which of the joint probability p(y,\theta) (P), the posterior approximator p_A(\theta|y) (A), and the data \tilde{y} (D) should he/she update, in which sequence, and based on what signal? Definitions and examples of P, A, D are in the paper by @paul.buerkner, @scholz, and Stefan Radev (which I highly recommend!). For simplicity, I divided this journey into decision nodes and paths.
The basic idea is that a modeler’s journey is a sequential decision problem in which the signal at each step is a set of high-dimensional diagnostics and the actions are updating P, A, or D. This naturally leads to the question: can we typify the paths that reach a state of good enough diagnostics? For instance, building on the Birthday problem, a time series model for the number of births per day, let’s say a modeler starts with a simple slow trend model (P), an optimization approximation algorithm (A), and births per day in the USA from 1969-1988 (D). After some modeling, he/she retrieved[1] good enough community-recommended diagnostics (prior predictive check, simulation-based calibration, posterior predictive check) using a time series model with long-term, seasonal, weekly, day-of-year, and special floating day variation (P’), a Markov chain approximation algorithm (A’), and births per year in the USA from 1969-2018 (D’). However, there exist at least 20 paths (excluding giving up at each decision) that this modeler could have gone through.
3 types of decision
Q1. prior predictive check diagnostics?
- bad: update P
- good enough: go to Q2
Q2. simulation-based calibration diagnostics?
- bad: update P or A
- good enough: go to Q3
Q3. posterior predictive check diagnostics?
- bad: update P or A or D
- good enough: finish
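The three decision nodes above can be sketched as a small lookup, purely as an illustration. The names `CHECKS`, `ALLOWED_UPDATES`, and `next_decision` are hypothetical and not part of any package; the only input is whether each diagnostic is judged good enough.

```python
from typing import Dict

# Hypothetical names; the check order and allowed updates mirror Q1-Q3 above.
CHECKS = ["ppc1", "sbc", "ppc2"]
ALLOWED_UPDATES = {"ppc1": ["P"], "sbc": ["P", "A"], "ppc2": ["P", "A", "D"]}

def next_decision(good_enough: Dict[str, bool]) -> dict:
    """Return the first failing check and the components that may be updated."""
    for check in CHECKS:
        if not good_enough[check]:
            return {"failed": check, "update": ALLOWED_UPDATES[check]}
    # All three diagnostics are good enough: the journey is finished.
    return {"failed": None, "update": []}

print(next_decision({"ppc1": True, "sbc": False, "ppc2": False}))
# {'failed': 'sbc', 'update': ['P', 'A']}
```

Which of the allowed components to update first is exactly the open question; the sketch only encodes which options are on the table at each node.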
good enough prior predictive check (ppc1) diagnostics := the range of observed data simulated from the prior is not extreme, i.e. it cannot be rejected.
good enough simulation-based calibration (sbc) diagnostics := the ECDF-based plots are within the confidence band at level alpha for every test quantity (t).
good enough posterior predictive check (ppc2) diagnostics := the range of data simulated from the posterior is acceptable for every quantity of interest (observed data, utility).
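To make the sbc criterion concrete, here is a minimal sketch for a single test quantity. It is a simplification and not how the SBC package computes its bands: it uses a pointwise normal-approximation band around the uniform CDF, whereas dedicated tools typically use simultaneous bands. The function name `ecdf_within_band` and its arguments are hypothetical.

```python
import math
from statistics import NormalDist

def ecdf_within_band(ranks, n_draws, alpha=0.05):
    """Check whether the ECDF of SBC ranks stays inside a pointwise band.

    ranks: one rank in 0..n_draws per simulated data set.
    Assumption: pointwise normal-approximation band, not a simultaneous one.
    """
    n = len(ranks)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    for r in range(n_draws):  # r = n_draws is skipped: the ECDF is always 1 there
        expected = (r + 1) / (n_draws + 1)  # uniform CDF on 0..n_draws
        observed = sum(x <= r for x in ranks) / n
        half_width = z * math.sqrt(expected * (1 - expected) / n)
        if abs(observed - expected) > half_width:
            return False
    return True

# Perfectly uniform ranks pass; ranks piled up at 0 do not.
uniform_ranks = list(range(10)) * 10
biased_ranks = [0] * 100
print(ecdf_within_band(uniform_ranks, n_draws=9))
print(ecdf_within_band(biased_ranks, n_draws=9))
```

The choice of alpha is one place where the subjectivity of "good enough" enters: a stricter alpha widens the band and makes the check easier to pass.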
Subjectivity in “good enough” diagnostics makes this a decision problem. The bar would be high in the pharmaceutical and defense industries, where human lives are at stake. Even within the same industry, a company with more resources (computation, time, human capital) would set a higher bar, and even within the same company, the bar can be lowered as a project deadline approaches.
≥20 types of path
- bad ppc1
  - 1b_P (1): update P to retrieve good enough ppc1 diagnostics
- good enough ppc1, bad sbc
  - 1g2b_P, 1g2b_A (2): update P or A to retrieve good enough sbc
  - 1g2b_PA, 1g2b_AP (2): updating P or A alone still gives bad sbc; update P then A, or A then P, to retrieve good enough sbc
- good enough ppc1 and sbc, bad ppc2
  - 1g2g3b_P, 1g2g3b_A, 1g2g3b_D (3): update P or A or D to retrieve good enough ppc2
  - 1g2g3b_PA, 1g2g3b_AP, 1g2g3b_PD, 1g2g3b_DP, 1g2g3b_AD, 1g2g3b_DA (6): bad ppc2 after updating one; retrieve good enough ppc2 after updating two
  - 1g2g3b_PAD, 1g2g3b_PDA, 1g2g3b_ADP, 1g2g3b_APD, 1g2g3b_DAP, 1g2g3b_DPA (6): bad ppc2 even after updating two; retrieve good enough ppc2 after updating all three
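The count of 20 can be reproduced by enumerating ordered update sequences, under the assumptions that each component is updated at most once per path and that the order of updates matters. `UPDATES` and `enumerate_paths` are hypothetical names used only for this sketch.

```python
from itertools import permutations

# Components that may be updated at each failing check (from Q1-Q3).
UPDATES = {"1b": "P", "1g2b": "PA", "1g2g3b": "PAD"}

def enumerate_paths(updates=UPDATES):
    """List every ordered sequence of distinct updates per failing check."""
    paths = []
    for prefix, components in updates.items():
        for k in range(1, len(components) + 1):
            for order in permutations(components, k):
                paths.append(f"{prefix}_{''.join(order)}")
    return paths

paths = enumerate_paths()
print(len(paths))  # 20 = 1 + (2 + 2) + (3 + 6 + 6)
```

Allowing a component to be updated more than once, or a check that was good enough to turn bad again after an update, would make the number of paths grow beyond 20.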
tagging some workflow enthusiasts! @andrewgelman @avehtari @Bob_Carpenter @betanalpha @martinmodrak @paul.buerkner @spinkney @mike-lawrence
Thank you.
[1] “Retrieve” in the sense that the default is good enough diagnostics, and learning happens in the process of resolving inconsistencies between the internal world model (P), the tools to implement models (A), and the sensed external world (D).