Linear model intercept has unexpectedly small variance

Hi, I’m seeking help understanding what’s happening with my model. I can’t share the specific data, but suppose the data were generated as y = c·x + b, with Gaussian noise N(0, 0.25) added to each point, so the observations vary around a regression line with slope c. That is essentially what my data look like; suppose the true values at the population level are c = 0.75 and b = 0. The model is multilevel, but the groups are quite similar.
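For concreteness, here is a minimal R sketch of the kind of synthetic data described above; the number of groups, group sizes, range of x, and the small between-group variation are all assumptions, not anything from my real data:

```r
# Simulate y = c*x + b + noise with true slope c = 0.75, intercept b = 0,
# and noise sd 0.5 (i.e., variance 0.25). Groups get small random
# intercept/slope offsets, so they are "quite similar".
set.seed(123)
n_groups <- 5    # assumed number of groups
n_per    <- 50   # assumed observations per group
c_true   <- 0.75
b_true   <- 0

grouping <- rep(seq_len(n_groups), each = n_per)
x <- runif(n_groups * n_per, -2, 2)

# Small group-level deviations from the population line.
b_g <- rnorm(n_groups, 0, 0.05)
c_g <- rnorm(n_groups, 0, 0.05)

y <- (b_true + b_g[grouping]) +
     (c_true + c_g[grouping]) * x +
     rnorm(length(x), 0, 0.5)   # sd = 0.5 <=> variance 0.25

df_curr <- data.frame(y = y, x = x, grouping = factor(grouping))
```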

So I’m fitting a model in brms as follows:

priors <- c(prior(normal(0, 1), class = Intercept),
            prior(normal(0.75, 1), class = b),
            prior(normal(0, 1), class = sigma),
            prior(normal(0, 0.5), class = sd))

brm_int_plus_slope <- brm(y ~ 1 + x + (1 + x | grouping),
                          data = df_curr, family = gaussian(), prior = priors,
                          warmup = 500, iter = 3000, chains = 2, cores = 2)

The resulting model fits the slope well, but the posterior variance of the intercept is far smaller than I would expect, on the order of +/- 0.01. As a result, the model captures the variance around the origin very poorly. I would have expected a much larger estimated error on the intercept to account for the variation around the regression line. Model checks (mixing, posterior predictive checks, lagged autocorrelation) all look good. Am I mis-specifying my model in some obvious way? Is there another way to specify the priors so that the intercept captures more of the variance? Any help would be much appreciated. Thanks.

I think you may be confusing the posterior variance of the intercept with posterior predictive uncertainty. Posterior predictive inference takes the form:

p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta) \cdot p(\theta \mid y) \ \text{d}\theta.

There are two sources of uncertainty here. The posterior p(\theta \mid y) gives you the estimation uncertainty—that’s the posterior for the parameters \theta given the observed data y. The second source of uncertainty is the sampling distribution for new data, p(\tilde{y} \mid \theta). Here, if we have a regression, then it’s \tilde{y}_n \sim \text{normal}(\alpha + \beta \cdot \tilde{x}_n, \sigma), where you get additional sampling variance from \sigma.

In terms of estimation, if you want the posterior predictive mean of \tilde{y}_n given \tilde{x}_n, you average over draws for \alpha, \beta, \sigma as follows:

\frac{1}{M} \sum_{m=1}^M \tilde{y}_n^{(m)},

where

\tilde{y}_n^{(m)} \sim \text{normal}(\alpha^{(m)} + \beta^{(m)} \cdot \tilde{x}_n, \sigma^{(m)})

is sampled using a random number generator and we sample the parameters from the posterior,

\alpha^{(m)}, \beta^{(m)}, \sigma^{(m)} \sim p(\alpha, \beta, \sigma \mid y).