Setting an intercept prior on a regression in brms

I have what is probably a very simple problem, but I haven’t found the solution for it yet, and I’d appreciate some help. I’m setting a prior on an intercept, which gives me a summary() that confuses me. Here’s a very simple example.

set.seed(123)
beta_0 <- 5
beta_1 <- 2
sigma <- 10
x <- runif(100, min = 10, max = 100)
mu <- beta_0 + beta_1*x
y <- rnorm(100, mean = mu, sd = sigma)

priors <- c(prior(constant(10), class = "sigma"),
            prior(constant(5), class = "Intercept"))
fit <- brm(y ~ x,
           data = dat, 
           prior = priors,
           chains = 4, 
           iter = 2000,
           refresh = 0)
summary(fit)

This compiles and runs just fine, but look at the output of summary().

 Family: gaussian 
  Links: mu = identity 
Formula: y ~ x 
   Data: dat (Number of observations: 100) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept  -106.04      2.14  -110.11  -101.81 1.00     1407     1749
x             1.97      0.04     1.89     2.04 1.00     1407     1749

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma    10.00      0.00    10.00    10.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The estimate for Intercept isn’t a constant 5 as I expected. Looking at the underlying Stan code I see that b_Intercept is the “actual population-level intercept”, i.e., the intercept after removing the centering that brms does internally. If I dump the output to a data frame, and summarize I get this.

> tmp <- data.frame(fit)
> summary(tmp)
  b_Intercept           b_x            sigma      Intercept     lprior       lp__      
 Min.   :-115.16   Min.   :1.843   Min.   :10   Min.   :5   Min.   :0   Min.   :-6883  
 1st Qu.:-107.49   1st Qu.:1.942   1st Qu.:10   1st Qu.:5   1st Qu.:0   1st Qu.:-6875  
 Median :-106.05   Median :1.968   Median :10   Median :5   Median :0   Median :-6875  
 Mean   :-106.04   Mean   :1.968   Mean   :10   Mean   :5   Mean   :0   Mean   :-6875  
 3rd Qu.:-104.58   3rd Qu.:1.994   3rd Qu.:10   3rd Qu.:5   3rd Qu.:0   3rd Qu.:-6874  
 Max.   : -98.99   Max.   :2.129   Max.   :10   Max.   :5   Max.   :0   Max.   :-6874  
> 

Here Intercept is a constant, 5, as I expect, and the mean reported for Intercept in summary(fit) matches the mean of b_Intercept as expected (once you’ve read the Stan code).

BUT since b_Intercept is calculated in generated quantities, there isn’t a way to put a prior on b_Intercept. Do I just have to remember to ignore Intercept in a summary when I’ve set a prior on it?

I realized that there’s a very simple solution to the problem I posted about 10 minutes after posting it.

If I’m going to treat the intercept as a constant, I simply subtract that constant from the response before fitting the model and remove the intercept from the model with a -1 in the formula.

That solves my immediate problem, but it leaves the more general problem of setting a prior on the intercept. Using prior(<density>, class = "Intercept") places a prior on Intercept in the Stan code (which is looking at the centered data), but the reported intercept is b_Intercept (which is transformed back to the uncentered data). When, if ever, would it make sense to put a prior on Intercept. After all, once the data have been centered, the intercept should be close to 0.

Whatever prior you intend to put on b_Intercept translates directly to a prior on Interceptvia only data-derived constants. Thus, there is a way to put a prior on b_Intercept. Just figure out what the corresponding prior on Intercept is, and apply that prior to Intercept.

Saying it doesn’t make sense to put a prior on Intercept because it tends to be zero is subtly but exactly analogous to saying it doesn’t make sense to put a prior on b_Intercept because it tends to be the mean of the data.

Note that when setting user-defined priors, many users prefer to turn off the automatic centering with 0 + Intercept syntax in the model formula, to avoid precisely this inconvenience.

1 Like