brms returning negative estimates of sigma

Apologies for a question which may well have an obvious answer, but I haven’t been able to find one on this forum or elsewhere.

I notice that whenever I model data containing negative values, brms tends to produce severe underestimates (or even negative estimates) of the group-level sigma:

# load packages
library(brms)
library(tibble)

# set seed for reproducibility
set.seed(4321)

# make the data
tempData <- tibble(
  y = rnorm(n = 50, mean = 0, sd = 1),
  x = factor(sample(0:1, size = 50, replace = TRUE))
)

# save the mean and SD for passing into brms
m <- mean(tempData$y)
sDev <- sd(tempData$y)

# convert mean and SD to Stan variables
stanvars <- 
  stanvar(m, name = "m") + 
  stanvar(sDev, name = "sDev") 

# fit the model
tempNorm <- 
  brm(
    data = tempData,
    family = gaussian,
    bf(y ~ 0 + x,      
       sigma ~ 0 + x),               
    prior = c(
      prior(normal(m, sDev * 100), 
            class = b),                 
      prior(normal(0, sDev), 
            class = b,
            dpar = sigma)                
    ), 
    stanvars = stanvars,
    seed = 16,
    save_pars = save_pars(all = TRUE),
    sample_prior = TRUE                 
  )

This produces reasonable estimates of the group-level means; the sigma estimates, though, are problematic:

> tempNorm
 Family: gaussian 
  Links: mu = identity; sigma = log 
Formula: y ~ 0 + x 
         sigma ~ 0 + x
   Data: tempData (Number of observations: 50) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
x0           0.07      0.18    -0.28     0.43 1.00     3660     2901
x1           0.16      0.17    -0.16     0.50 1.00     4094     2873
sigma_x0    -0.17      0.14    -0.43     0.12 1.00     4643     2934
sigma_x1    -0.15      0.13    -0.38     0.11 1.00     4030     2954

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Imposing a lower bound of 0 on sigma leads to estimates which are still far too small. I’ve tried a few solutions (modifying adapt_delta and setting wider or tighter priors), without success.

Thanks in advance to anyone who can tell me what I’m doing wrong here!

  • Operating System: Windows 11
  • brms Version: 2.21.0

Note that sigma has a log link. The output seems to be capturing your synthetic data well: y has an SD of 1 (exp(0) = 1), with no relationship to x.

When you attach a linear model to \sigma, brm() defaults to the log link. If you exponentiate those \sigma parameters, you’ll see they aren’t that bad.
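For example (a sketch, assuming the fitted `tempNorm` object from the code above), you can inspect the log-scale sigma coefficients and back-transform them yourself:

```r
# posterior summaries of the sigma coefficients (log scale)
fixef(tempNorm)[c("sigma_x0", "sigma_x1"), ]

# exponentiate to get sigma on the natural scale;
# e.g. exp(-0.17) is roughly 0.84, close to the true SD of 1
exp(fixef(tempNorm)["sigma_x0", "Estimate"])
exp(fixef(tempNorm)["sigma_x1", "Estimate"])
```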

Ha. It looks like @franzsf and I gave the same answer at the same time, but @franzsf beat me by a hair.


Thank you both. I knew already that brms uses the log scale for sigma (I’m reading your book, @Solomon!), but I thought the exponentiation happened under the hood. Sorry for the trouble.

The question is solved, but here is a bit of context about WHY the exponentiation doesn’t happen under the hood.

If you had used a formula such as sigma ~ x, there would be an intercept and an effect for x1 relative to the intercept. If the coefficients were exponentiated internally, it would be confusing, because the sigma for the x1 condition would not be sigma_Intercept + sigma_x1, but rather sigma_Intercept * sigma_x1. With other link functions it becomes even more complicated. So brms keeps the parameters on the scale on which they were sampled, which keeps the linear-combination interpretation consistent across all link functions.
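A quick numeric illustration of that point (the coefficient values here are hypothetical, not taken from the model above):

```r
# hypothetical log-scale coefficients from a sigma ~ x model
sigma_Intercept <- -0.2
sigma_x1 <- 0.1

# on the natural scale, the coefficients combine multiplicatively:
# exp(a + b) == exp(a) * exp(b)
exp(sigma_Intercept + sigma_x1)       # sigma for the x1 condition
exp(sigma_Intercept) * exp(sigma_x1)  # identical value
```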


Thanks @Ven_Popov for the additional explanation. That makes sense and wouldn’t necessarily have occurred to me.
