[CFP] SciPy 2018 - Scientific Computing with Python conference


Thank you, @ahartikainen!


Hi everyone,

@marianne and I plan to submit a short paper to the SciPy proceedings. I have written a first draft on the Bayesian workflow chapter (attached). My target audience is scientists who want an introduction to Bayesian statistics. I would appreciate any feedback, especially if you spot any inaccuracies.

Thank you in advance for your time -
And thanks @ahartikainen for the mention!

bayesian_workflow.pdf (101.2 KB)


We generate the fake data from the model itself; the goal is not to make it look like the real data. Fitting the model to data simulated from known parameter values is the way to check that the model implementation is working.
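To make the fake-data check concrete, here is a minimal sketch in plain Python. It assumes a toy model that is not from the draft: y ~ Normal(mu, sigma) with known sigma and a Normal prior on mu, so the posterior is available in closed form. The function names, the seed, and the parameter values are all made up for illustration. With Stan you would fit the simulated data by sampling instead of using the conjugate formula.

```python
import random
from statistics import NormalDist, fmean

def simulate_fake_data(mu_true, sigma, n, rng):
    """Draw n observations from the assumed model y ~ Normal(mu_true, sigma)."""
    return [rng.gauss(mu_true, sigma) for _ in range(n)]

def posterior_for_mean(y, sigma, prior_mean, prior_sd):
    """Conjugate posterior for mu: Normal prior on mu, known sigma."""
    prior_prec = 1.0 / prior_sd**2
    data_prec = len(y) / sigma**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * fmean(y))
    return NormalDist(post_mean, post_var**0.5)

rng = random.Random(2018)          # arbitrary seed, for reproducibility only
mu_true, sigma = 1.5, 2.0          # hypothetical "known truth" for the check

y_fake = simulate_fake_data(mu_true, sigma, n=500, rng=rng)
post = posterior_for_mean(y_fake, sigma, prior_mean=0.0, prior_sd=5.0)

# Distance between the posterior mean and the truth, in posterior sd units.
z = (post.mean - mu_true) / post.stdev
print(f"true mu = {mu_true}, posterior mean = {post.mean:.3f}, "
      f"posterior sd = {post.stdev:.3f}, z = {z:.2f}")
```

If the implementation is correct, the posterior should concentrate near the value used to simulate the data. A z far outside a few units would point to a bug in the model code rather than a problem with the real data.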

Then we want to use posterior predictive checks (PPCs) to check that the model fits the real data well.

Bayesian modeling supports lots of different kinds of inference: full Bayes, variational inference, point estimation, etc. Bayesian modeling is not itself a kind of inference.

I wouldn’t say “what values” and “what ranges of values” are different questions.

Andrew goes over the methodology quite nicely in the very first chapter of BDA, and Betancourt’s workflow case studies are useful for hands-on practice.

The discussion on priors should be updated to follow the best practices advice laid out in our Wiki page on priors. Unlike the early chapters of BDA, we do not want to be encouraging people to use vague or improper priors (which do require proper posteriors). We also don’t do model selection with Bayes factors, so priors being spot on isn’t an issue.

I’d stress that the prior and likelihood make up the model together and both are chosen subjectively by the researcher. For example, the researcher may include or exclude a predictor in the likelihood or pool or not pool coefficients, before even considering a prior.

The posterior intervals are not confidence intervals, and we don’t want all of them to contain the true values. If we take 50% posterior intervals, we want 50% of them to contain the true values if everything’s calibrated.
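The calibration claim can be checked by simulation. The sketch below again assumes a toy conjugate model (Normal prior on mu, Normal likelihood with known sigma) that is not taken from the draft. It repeatedly draws a true mu from the prior, simulates data, forms the central 50% posterior interval, and counts how often the interval contains the true value. The helper names and simulation sizes are illustrative choices.

```python
import random
from statistics import NormalDist, fmean

def central_interval(dist, prob):
    """Central interval holding `prob` of the mass of a NormalDist."""
    return dist.inv_cdf((1 - prob) / 2), dist.inv_cdf((1 + prob) / 2)

def coverage(prob, n_sims, n_obs, rng, prior=NormalDist(0.0, 1.0), sigma=1.0):
    """Fraction of posterior intervals that contain the simulated true mu."""
    hits = 0
    for _ in range(n_sims):
        # Draw the "truth" from the prior, then data from the likelihood.
        mu_true = rng.gauss(prior.mean, prior.stdev)
        y = [rng.gauss(mu_true, sigma) for _ in range(n_obs)]
        # Conjugate normal-normal posterior for mu (sigma assumed known).
        prior_prec = 1.0 / prior.variance
        data_prec = n_obs / sigma**2
        post_var = 1.0 / (prior_prec + data_prec)
        post_mean = post_var * (prior_prec * prior.mean + data_prec * fmean(y))
        lo, hi = central_interval(NormalDist(post_mean, post_var**0.5), prob)
        hits += lo <= mu_true <= hi
    return hits / n_sims

rng = random.Random(0)  # arbitrary seed
frac = coverage(0.5, n_sims=4000, n_obs=10, rng=rng)
print(f"fraction of 50% intervals containing the truth: {frac:.3f}")
```

Because the data are simulated from the same prior and likelihood used for inference, the fraction should land close to 0.5, up to Monte Carlo error. Coverage far from the nominal level would suggest either a bug or a mismatch between the simulating model and the fitted model.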

The thing you really need to stress is posterior predictive checking, which is where you see if the model you came up with is well-specified for the data. Calibration is only guaranteed if the model is well specified.
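A posterior predictive check can be sketched in a few lines. This example is illustrative only and makes several assumptions that are not from the draft: the "real" data are simulated from an exponential distribution, the fitted model is a normal distribution, sigma is fixed at the sample standard deviation as a simplification, and the test statistic is the sample minimum. The function name `ppc_pvalue` is hypothetical.

```python
import random
from statistics import fmean, stdev

def ppc_pvalue(y, stat, n_rep, rng):
    """Fraction of replicated datasets with stat(y_rep) >= stat(y)."""
    n = len(y)
    ybar, s = fmean(y), stdev(y)
    t_obs = stat(y)
    exceed = 0
    for _ in range(n_rep):
        # Simplification: sigma is fixed at the sample sd, so the posterior
        # for mu is approximated by Normal(ybar, s / sqrt(n)).
        mu = rng.gauss(ybar, s / n**0.5)
        # Replicate a dataset of the same size from the fitted normal model.
        y_rep = [rng.gauss(mu, s) for _ in range(n)]
        exceed += stat(y_rep) >= t_obs
    return exceed / n_rep

rng = random.Random(42)  # arbitrary seed
# Stand-in for real data: positive and skewed, so a normal model is misspecified.
y_skewed = [rng.expovariate(1.0) for _ in range(100)]

p_min = ppc_pvalue(y_skewed, min, n_rep=1000, rng=rng)
print(f"posterior predictive p-value for min(y): {p_min:.3f}")
```

The observed minimum is never negative, while datasets replicated from the normal model routinely contain negative values, so almost no replicate has a minimum as large as the observed one. A p-value near 0 or 1 for a statistic you care about is the signal that the model is not well specified for the data.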


Thank you for the feedback - it’s very helpful. I agree that PPC and model evaluation were missing from the first draft.


Here is a link to the talk: