Computation of contrasts using emmeans based on brms input in R

Hi all,

I have a question about the computation of the estimated marginal means and contrasts (using the emmeans package in R) based on a fitted brms model. Consider the following toy example:

library("emmeans")
library("brms")

# Load the iris dataset
data("iris")

# Fit linear model with Petal.Width as outcome and Species as covariate
set.seed(123); fit <- brm(Petal.Width ~ Species, data = iris)
print(summary(fit, robust=TRUE), digits=10) 

# Generated output:
Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Petal.Width ~ Species 
   Data: iris (Number of observations: 150) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Population-Level Effects: 
                      Estimate    Est.Error     l-95% CI     u-95% CI         Rhat Bulk_ESS Tail_ESS
Intercept         0.2459680662 0.0306554287 0.1886172595 0.3039992100 1.0024083943     2985     2505
Speciesversicolor 1.0798603337 0.0421999664 0.9994487892 1.1599985585 1.0017138236     3222     2806
Speciesvirginica  1.7805708007 0.0416863097 1.7017711722 1.8618851318 1.0010220040     3566     3226

Family Specific Parameters: 
          Estimate    Est.Error     l-95% CI     u-95% CI         Rhat Bulk_ESS Tail_ESS
sigma 0.2058639462 0.0121732295 0.1842404957 0.2311617068 1.0004071432     3846     3082

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

# Estimated marginal means using emmeans package
emm_options(opt.digits = FALSE); options(digits=10)
emmeans(fit, specs = pairwise ~ Species)

Generated output:
$emmeans
 Species          emmean    lower.HPD    upper.HPD
 setosa     0.2459680662 0.1852710892 0.3000677047
 versicolor 1.3270338429 1.2649721923 1.3800841652
 virginica  2.0265604092 1.9699723046 2.0822403090

Point estimate displayed: median 
HPD interval probability: 0.95 

$contrasts
 contrast                    estimate     lower.HPD     upper.HPD
 setosa - versicolor    -1.0798603337 -1.1602782894 -0.9999120115
 setosa - virginica     -1.7805708007 -1.8588519183 -1.6993508287
 versicolor - virginica -0.7011980843 -0.7789274031 -0.6202138869

Point estimate displayed: median 
HPD interval probability: 0.95 

It is unclear to me how the emmeans values (provided in the output above of the emmeans function) are computed. These emmeans are not exactly the same as the estimated means that are ‘manually’ computed based on the medians of the posterior distributions that are shown in the brms above output (except for the intercept). For example, for versicolor: 0.2459680662 +1.0798603337, which yields 1.3258284 (versus 1.3270338429 as provided by emmeans).

Similarly, the contrasts that are shown in the output of the emmeans function are not
exactly the same as the ‘manually’ computed values based on the emmeans (reported by the emmeans function itself). For example, for the setosa - versicolor contrast: 0.2459680662-1.3270338429, which gives -1.081065777 (versus -1.0798603337 as provided by the emmeans function).

It is not clear to me why the contrasts and the emmeans (of the emmeans function) are not ‘in line’ with each other? Can anybody clarify?

  • Operating System: macOS Sonoma 14.3
  • brms Version: 2.20.4
  • emmeans version: 1.8.0

(by the way, If I try to reproduce the emmeans values (provided by the emmeans function) using the posterior distributions that are saved by brms, the results are also not the same. E.g., estimated mean for versicolor:
(median(posterior_samples(fit)[,1])+median(posterior_samples(fit)[,2])), which yields 1.3258284 (versus 1.3270338429 in the emmeans output)

Quite surprisingly, the contrasts values that are reported by the emmeans function can be retrieved exactly based on the posterior_samples function. For example, for the contrast setosa - versiolor:
median(posterior_samples(fit)[,1])-
(median(posterior_samples(fit)[,1])+median(posterior_samples(fit)[,2])), which yields -1.079860334 (which is the same as the value reported by emmeans)

)

I haven’t studied this in detail but could it be a problem with the median of a difference not equaling the difference in medians?