Prediction Intervals

Using posterior_predict, I have obtained a ppd object from newdata.

Using bayesplot's mcmc_intervals, I can estimate credible intervals for each set of values in newdata.

Are the boundaries of the credible interval a description of possible values of a new observation? That is, are they a prediction interval?


Since nobody else answered, here is my take, but I am not 100% sure it is correct. Maybe @lauren can check my reasoning?

posterior_predict should give you something like prediction intervals under the condition that your model is exactly correct. For a linear model, this means assuming the relationship is actually linear and determined exactly by the covariates you decided to include AND that all parameters were actually drawn from the priors you have specified. Note also that under this condition, the 95% credible interval for future observations will contain 95% of future observations if you average over possible experiments. For any single idealized (matching your model) experiment, the credible interval for future observations will almost certainly be systematically off (wider/narrower/biased) from actual future observations, as it includes uncertainty in the parameters, but there is only one “true” value of the parameters. Further, in any real scenario, the credible interval will miss predictions because of model mismatch to the real process.

Does that make sense?


@jonah and I talk about this from time to time.

It’s kind of a funny thing because in frequentist statistics more often than not you see confidence intervals reported. Prediction intervals seem much less common so it’s surprising that Bayesian inference is (at least as far as I know) entirely prediction intervals.

Rob Hyndman has a nice blog post elaborating why that is, but the cliff’s notes are that once the parameter is modeled as a random variable, the inherent meaning of a confidence interval (i.e., 95% of all 95% confidence intervals contain the true parameter value) doesn’t make sense.

Some people (including me!) do sometimes look at frequentist properties of Bayesian intervals. That can be contentious, but I view it as calibration of methods in a very practical sense.


Yeah, when you call posterior_predict() you obtain draws from the posterior predictive distribution, which is precisely the distribution of new data according to your model (like @martinmodrak said) conditioned on the data you have. Interval summaries of these draws are estimates of what we would call Bayesian prediction or predictive intervals. Regardless of the terminology, those intervals summarize what your model thinks about new data given what it currently knows (and assumes).
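To make "interval summaries of these draws" concrete, here is a minimal sketch in plain Python rather than R. It does not call rstanarm; it assumes a toy conjugate model (normal observations with known sigma and a Normal(0, prior_sd) prior on the mean) so that posterior draws can be generated directly. The function names and the toy data are hypothetical and only serve the illustration.

```python
import random

def posterior_predictive_draws(y, n_draws=4000, prior_sd=1.0, sigma=1.0, seed=1):
    """Draws of a new observation for a toy model (an assumption of this sketch):
    y_i ~ Normal(mu, sigma) with sigma known, and mu ~ Normal(0, prior_sd)."""
    rng = random.Random(seed)
    post_prec = 1.0 / prior_sd**2 + len(y) / sigma**2   # posterior precision of mu
    post_mean = (sum(y) / sigma**2) / post_prec         # posterior mean of mu
    post_sd = post_prec ** -0.5
    draws = []
    for _ in range(n_draws):
        mu = rng.gauss(post_mean, post_sd)   # parameter uncertainty
        draws.append(rng.gauss(mu, sigma))   # plus observation noise
    return draws

def central_interval(draws, prob=0.90):
    """Empirical central interval; prob=0.90 gives the 5%-95% range."""
    s = sorted(draws)
    lo_idx = int(round((1 - prob) / 2 * (len(s) - 1)))
    hi_idx = int(round((1 + prob) / 2 * (len(s) - 1)))
    return s[lo_idx], s[hi_idx]

y = [0.8, 1.3, 0.4, 1.9, 1.1]            # made-up data
draws = posterior_predictive_draws(y)
print(central_interval(draws, 0.90))     # a 90% posterior predictive interval
```

Each draw combines a draw of the parameter with a draw of observation noise, which is why the predictive interval is wider than the posterior interval for the mean alone.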

Is that what you were interested in or were you specifically referring to the difference between frequentist and Bayesian approaches to prediction?


I find that it helps to be super explicit about the definition of calibration to avoid as much confusion as possible when discussing all the intervals floating around.

As already noted by @lauren and @jonah, a posterior interval and a posterior predictive interval are just convenient summaries of a probability distribution: the posterior distribution and the posterior predictive distribution, respectively. Even under ideal circumstances where your model is correct, those distributions have no guarantee to have any relation to the true model configuration (in the case of posterior intervals over the parameter space) or future data (in the case of posterior predictive intervals over the observational space). This contrasts with frequentist intervals, which are constructed to have certain coverage properties, at least under idealized assumptions.

That said, there’s no reason why we can’t ourselves calibrate the posterior and posterior predictive intervals to see how well they do cover a certain value. A pure frequentist calibration is challenging because you have to scan through all of the parameter values (even those near infinity), simulate data and see how often the resulting intervals cover a desired value, and then report the worst coverage for all of those parameter values (for more see for example Probabilistic Modeling and Statistical Inference, especially Section 2.2). We can also, however, consider a Bayesian calibration that just requires computing coverage with respect to those model configurations supported by the prior distribution. For more see Probabilistic Modeling and Statistical Inference.
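As a rough illustration of the Bayesian calibration described above, the sketch below repeatedly draws a "true" mean from the prior, simulates data, builds the 95% posterior predictive interval, and checks whether a new observation falls inside. It assumes the same kind of toy conjugate normal model with known sigma, so the interval has a closed form; the function names and the `fixed_mu` option are hypothetical. Passing `fixed_mu` holds the true mean at one value instead, which shows how coverage for a single parameter value can differ from the prior-averaged coverage.

```python
import random

def predictive_interval(y, prior_sd, sigma, z=1.959964):
    """Closed-form 95% posterior predictive interval for the toy model
    y_i ~ Normal(mu, sigma), mu ~ Normal(0, prior_sd), sigma known."""
    post_prec = 1.0 / prior_sd**2 + len(y) / sigma**2
    post_mean = (sum(y) / sigma**2) / post_prec
    pred_sd = (1.0 / post_prec + sigma**2) ** 0.5   # parameter + noise variance
    return post_mean - z * pred_sd, post_mean + z * pred_sd

def coverage(n_sims=20000, n_obs=5, prior_sd=1.0, sigma=1.0, fixed_mu=None, seed=1):
    """Fraction of simulated experiments whose interval covers a new observation.
    With fixed_mu=None the true mean is drawn from the prior each time."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        mu = rng.gauss(0.0, prior_sd) if fixed_mu is None else fixed_mu
        y = [rng.gauss(mu, sigma) for _ in range(n_obs)]
        lo, hi = predictive_interval(y, prior_sd, sigma)
        y_new = rng.gauss(mu, sigma)
        hits += lo <= y_new <= hi
    return hits / n_sims

print(coverage())               # averaged over the prior: close to 0.95
print(coverage(fixed_mu=3.0))   # one "true" value far from the prior mean
```

In this toy setup the prior-averaged coverage sits near the nominal 95%, while the coverage at a fixed mean that the prior considers unlikely lands below it. A frequentist calibration would instead scan over all values of `fixed_mu` and report the worst case.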

I very much appreciate the comments from martinmodrak, lauren, jonah, and betanalpha that included links to more extensive discussion.
I have previously only used frequentist methods; my model discussions would always include CIs and sometimes PIs.
As I move into Bayesian models I wanted to clarify my use of terminology, specifically prediction intervals. That was the reason for my question.
I accept that all models are incorrect.
I understand the strong reservations that are expressed about posterior prediction intervals.
These reservations would seem to discourage the use of posterior prediction intervals.
My next question is about your actual reporting practice.
Do each of you include posterior predictions in reporting your models?
I wish to follow best practices.



I don’t have strong reservations about uncertainty intervals! I love people reporting their uncertainty, and intuitive uncertainty is one of the things the Bayesian stuff does well imo.

They are prediction intervals and can be interpreted as such. People call them credible intervals or uncertainty intervals to differentiate them from frequentist prediction intervals. They are absolutely reported all the time. Sometimes I plot the whole posterior; sometimes that doesn’t work with what I’m doing, so I use intervals (generally 5-95 or 10-90). You might want to look into prior and posterior predictive checks, a related concept that can be used to investigate your model and understand the role of priors. :)

All of the “model is wrong” stuff is also true of frequentist statistics.


I don’t think any of us meant to give that impression, just being explicit about what they mean and don’t mean. Predictive distributions and interval estimate summaries of them are perhaps the most important tool of Bayesian inference. Without them we have nothing to say regarding what our model thinks about observables in the world (parameters don’t actually exist). In other words, we should often care about the predictive distribution as much as or even more than the posterior distribution of parameters (the latter is just required to obtain the former). So it’s indeed common to use and report about the posterior predictive distribution. Hope that helps clarify.


Just like @lauren and @jonah say, posterior predictive intervals are very useful when trying to communicate the predictive consequences of your model. I personally like showing multiple, nested intervals to communicate more structure. We’re just saying that you have to interpret those intervals carefully: without any explicit calibration they don’t have any guarantees to cover “true values” with any regularity. They’re just your model’s best understanding of the problem, and hence only as good as your model.
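For the nested intervals mentioned above, a small sketch of one way to compute them from a vector of predictive draws. The helper name `nested_intervals` and the choice of 50%, 80%, and 95% levels are assumptions for illustration, and the draws here are simulated stand-ins for one column of a draws-by-observations matrix such as the one posterior_predict returns.

```python
import random

def nested_intervals(draws, probs=(0.50, 0.80, 0.95)):
    """Central empirical intervals at several probability levels."""
    s = sorted(draws)
    n = len(s)
    out = {}
    for p in probs:
        lo = s[int(round((1 - p) / 2 * (n - 1)))]
        hi = s[int(round((1 + p) / 2 * (n - 1)))]
        out[p] = (lo, hi)
    return out

rng = random.Random(0)
fake_draws = [rng.gauss(10.0, 2.0) for _ in range(2000)]  # stand-in for real draws
for p, (lo, hi) in nested_intervals(fake_draws).items():
    print(f"{p:.0%} interval: [{lo:.2f}, {hi:.2f}]")
```

Because all levels are central intervals of the same draws, each wider interval contains the narrower ones, which is what makes them easy to overlay in a plot.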