Uncertainty around central tendency - Bayes SE vs. Bayesian CI

Hi All,

If I understand previous posts correctly, one way to define a standard error for a posterior mean is sd / sqrt(N_eff), giving

mean ± qnorm(0.95) * sd / sqrt(N_eff)

as an approximate 90% interval.

But that looks very different from looking at the 5% and 95% quantiles directly. If we want to express uncertainty in a way that is, by design, more similar to frequentist methods, for example to derive a Bayesian MoE (margin of error), is the first approach ever justified?



In Bayesian inference we have a single posterior distribution quantifying uncertainty in our model. The information in that distribution can be extracted with expectation values, such as means or variances or quantiles. There is no canonical “standard error”.

In practice, however, we cannot compute those expectations exactly and we have to resort to approximate algorithms like Markov chain Monte Carlo. When Markov chain Monte Carlo is well-behaved it satisfies a central limit theorem that states

hat{f} - E[f] ~ normal(0, Var[f] / ESS)

where sqrt(Var[f] / ESS) is the Markov chain Monte Carlo standard error. So standard errors and effective sample sizes arise not in describing the posterior but rather in quantifying how accurately we can compute descriptions of the posterior.
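To make that concrete, here is a minimal sketch in plain R (no Stan; the numbers are made up). With independent draws the effective sample size equals the number of draws, so the MCMC standard error is just sd(draws) / sqrt(N), and the central limit theorem predicts the spread of the mean estimate:

```r
# Sketch with independent "posterior" draws: ESS = N, so
# MCMC SE = sd(draws) / sqrt(N).
set.seed(1)
N <- 1e4
draws <- rnorm(N, mean = 2, sd = 3)   # stand-in for posterior draws of f
mcse  <- sd(draws) / sqrt(N)          # Monte Carlo standard error of mean(draws)

# Re-run the "chain" many times: the spread of the mean estimates should
# match the CLT prediction sqrt(Var[f] / ESS) = 3 / sqrt(N) = 0.03.
means <- replicate(2000, mean(rnorm(N, mean = 2, sd = 3)))
c(predicted = 3 / sqrt(N), observed = sd(means))
```

With correlated MCMC draws, N is replaced by the (smaller) effective sample size, which is exactly why ESS appears in the theorem above.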

I think that answer is way too smart for me.

Let me try and see if I understand correctly: Are you saying that the Markov chain Monte Carlo SE is not appropriate for describing uncertainty around central tendencies in the posterior? Meaning that, if I want to express uncertainty on model predictions, I should always look at quantiles of the posteriors of parameters, or sample the predictive value (i.e., the additive combination in a linear model) directly and read off uncertainty from quantiles, rather than relying on the Markov chain Monte Carlo SE in any way?

Thanks again,

Correct. Of course you could use other measures of dispersion, such as posterior mean +/- posterior standard deviation.

The MCMC SE comes into play only when you want to check that you are estimating the means/standard deviations/quantiles sufficiently accurately. For example, if the MCMC SE was not much smaller than the posterior standard deviation then the quoted interval (post mean - post sd, post mean + post sd) would be extremely noisy and not a particularly accurate representation of the posterior.

Hm, now I am confused. I think by MCMC SE you mean “se_mean” in print(), but this is not what I am using to construct uncertainty (sorry for the imprecision). I am using post mean ± 1.96 * post sd (“sd” in print()) / sqrt(N_eff). I am surprised that this interval looks so different (in my case tighter) from the empirical quantile-based interval, so I figured this is not a legitimate way to gauge dispersion, or is it?

Thanks again,

post sd / sqrt(N_eff) is the MCMC SE. The quantity that you are quoting is just an interval quantifying the variability of the MCMC mean estimate, as I described above.

That quantity, however, has no meaning when describing the posterior. N_eff is used solely for computing the MCMC SE. If you could compute posterior expectations exactly (or without MCMC) then it would never enter into a well-posed description of the posterior distribution.

I see! I think what is not clear to me is in what cases the MCMC SE is an appropriate measure of dispersion, versus the full Bayesian CI (which I assume is similar in width to mean ± 1.96 * post_sd). For example, when I take the full posterior (i.e., quantiles) into account when describing uncertainty around an additive combination from a linear model (i.e., predicted values), the full Bayesian CIs are usually much larger than in the MLE framework, but the MCMC SE approach is comparable…


MCMC SE is a measure of the accuracy of a computation. Suppose you wanted to compute sqrt(2) and I told you my appx_sqrt() function does a pretty good job of calculating square roots: it’s always within ± 0.03. Then when appx_sqrt(2) = 1.43, you can see that 1.41 + 0.02 = 1.43, so 1.43 is in fact within the error I told you my approximation would have.
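The analogy can be made concrete with a toy approximate square root (a hypothetical function for illustration, nothing to do with Stan): a couple of Newton iterations give an answer within a small, known error of the true value.

```r
# Toy "approximate computation with a known error bound":
# two Newton steps for sqrt(x), accurate to well within +/- 0.03 near x = 2.
appx_sqrt <- function(x, steps = 2) {
  g <- x / 2                  # crude starting guess
  for (i in seq_len(steps)) {
    g <- (g + x / g) / 2      # Newton update for g^2 = x
  }
  g
}
appx_sqrt(2)   # ~1.4167, within the advertised error of sqrt(2) = 1.41421...
```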

Think of MCMC SE as a measure of whether your Stan run was “long enough to be sufficiently accurate”

Whenever you want to calculate uncertain quantities from your model, they come from quantiles or averages over the posterior samples. You only use the MCMC SE to check whether your posterior samples were sufficient to make those averages close to the true values you’d get if you told Stan to run until the end of time.
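A quick way to see this (plain R again, with made-up numbers): as the number of draws grows, the MCMC SE shrinks like 1/sqrt(N), but sd(draws), which estimates the posterior standard deviation, converges to a fixed value rather than to zero.

```r
# Running "longer" shrinks the MCMC SE, not the posterior sd it estimates.
set.seed(2)
Ns    <- c(100, 10000, 1000000)
sds   <- sapply(Ns, function(N) sd(rnorm(N, mean = 0, sd = 1)))
mcses <- sds / sqrt(Ns)
round(cbind(N = Ns, posterior_sd = sds, mcmc_se = mcses), 5)
```

The posterior-sd column hovers around 1 at every N; only the MCMC SE column collapses toward zero.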

You have to be clear about how you are computing the predictions from your model: are these draws from the posterior predictive distribution (predictions of new y values given x, which incorporate the error term)? Or are these draws from the posterior distribution of fitted values (predictions of the mean of y given x)? The first will be much wider than the second.

In a quick, simple frequentist model in R, we can compare two intervals:

m <- lm(Sepal.Length ~ Species, iris)
# prediction interval is wider than confidence/fitted-value interval
predict(m, data.frame(Species = "setosa"), interval = "prediction")
#>     fit      lwr      upr
#> 1 5.006 3.978533 6.033467
predict(m, data.frame(Species = "setosa"), interval = "confidence")
#>     fit      lwr      upr
#> 1 5.006 4.862126 5.149874

Your comment that one kind of interval is larger than the other makes me wonder whether something like this is going on.
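The same distinction can be shown with plain Monte Carlo draws (made-up numbers, no Stan): draws of the fitted mean versus draws of a new observation that adds residual noise on top of it.

```r
# Fitted-value draws vs. posterior-predictive-style draws (toy numbers).
set.seed(3)
n_draws  <- 1e5
mu_draws <- rnorm(n_draws, mean = 5.0, sd = 0.07)        # uncertainty about the mean
sigma    <- 0.35                                         # residual sd, assumed known
y_new    <- rnorm(n_draws, mean = mu_draws, sd = sigma)  # new-observation draws

quantile(mu_draws, c(0.025, 0.975))  # narrow "confidence"-style interval
quantile(y_new,    c(0.025, 0.975))  # wide "prediction"-style interval
```

The second interval is wider because it stacks the observation noise on top of the uncertainty about the mean, just like the `"prediction"` interval above.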

To explain further:

If you want to know the posterior mean of your parameter, you can calculate the mean of the samples that Stan gives you. How accurate is this calculation, i.e., how close does it get you to the true posterior mean defined by the mathematical expressions that Stan evaluates? That depends on how long you ran Stan, and therefore on how many effective samples Stan gave you.

On the other hand, suppose you are running a survey and you want to know the mean of the survey population. The accuracy of this estimate depends not on how long you run Stan, but on how many people you survey. Running Stan longer will eliminate the inaccuracy of the calculation, but not the inaccuracy inherent in having a small dataset.

The posterior distribution, and hence the posterior quantiles, standard deviations, etc., tells you how much information you’ve extracted from your data.

The MCMC standard errors tell you how close your finite set of samples from Stan gets you to the values you would calculate from your posterior if you had an infinitely fast computer and could get an arbitrarily large sample from Stan.

As long as MCMC SE is small enough for your purposes, you can assume that the Stan samples give you an accurate description of the posterior distribution.

Assuming you have enough Stan samples, you can then summarize your uncertainty about the real world by posterior quantiles of your parameters, and ignore the MCMC errors.
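A small conjugate-normal sketch (flat prior, known sigma = 1, made-up data) separates the two sources of uncertainty: the posterior sd shrinks only when n grows, while the MCMC SE shrinks as you take more draws.

```r
# Posterior sd depends on the data size n; MCMC SE depends on the draw count.
set.seed(4)
n <- 25
y <- rnorm(n, mean = 10)        # the data
post_mean <- mean(y)
post_sd   <- 1 / sqrt(n)        # exact posterior sd here: 0.2

draws <- rnorm(1e5, post_mean, post_sd)   # a "perfect" posterior sampler
sd(draws)                 # ~0.2 no matter how many draws we take
sd(draws) / sqrt(1e5)     # MCMC SE: tiny, and it shrinks with more draws
```

To report uncertainty about the population mean you would quote post_sd (or the quantiles of the draws); the MCMC SE only tells you that 1e5 draws were plenty.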


Thanks, this is very clearly explained!

All the Best,