Unexpected influence from prior in binomial model in brms

joepowers16 · June 17, 2024, 10:44pm

Take this dataframe (df) representing an ABCD test on conversions.

When I take the descriptive means or fit a brms model to df with default uninformed priors I see that recipe A (the Intercept) has a mean of 49.3%.

library(brms)

prob_to_lor <- function(prob){log(prob / (1-prob))}

df <- structure(list(
     recipe = c("A", "B", "C", "D"), 
     events = c(4926L, 4968L, 5051L, 5136L), 
     trials = c(10000, 10000, 10000, 10000), 
     rate = c(0.4926, 0.4968, 0.5051, 0.5136)), 
     row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"))

fit1 <- brm(
    data = df,
    family = binomial,
    formula = "events | trials(trials) ~ 0 + Intercept + recipe",
    cores = 7,
    seed = 42
    ) 

fixef(fit1)['Intercept','Estimate'] |> plogis()

[1] 0.492667

I am perplexed that including a prior for the Intercept with a mean of 0.49 has the effect of increasing the estimate for the Intercept upward to 0.50.

# Convert prior probabilities to log-odds ratios for logistic regression
PRIOR_MU_INTERCEPT  <- .49
PRIOR_SD_INTERCEPT <- .025
PRIOR_SD_RECIPE <- .005

PRIOR_MU_INTERCEPT_LOR <- prob_to_lor(PRIOR_MU_INTERCEPT)
PRIOR_SD_INTERCEPT_LOR <- prob_to_lor(PRIOR_MU_INTERCEPT + PRIOR_SD_INTERCEPT) - prob_to_lor(PRIOR_MU_INTERCEPT)
PRIOR_SD_RECIPE_LOR <- prob_to_lor(PRIOR_MU_INTERCEPT + PRIOR_SD_RECIPE) - prob_to_lor(PRIOR_MU_INTERCEPT)

my_stanvars <- c(
    stanvar(PRIOR_MU_INTERCEPT_LOR, name = "PRIOR_MU_INTERCEPT_LOR"),
    stanvar(PRIOR_SD_INTERCEPT_LOR, name = "PRIOR_SD_INTERCEPT_LOR"),
    stanvar(PRIOR_SD_RECIPE_LOR, name = "PRIOR_SD_RECIPE_LOR")
)

my_priors <- 
    prior(normal(PRIOR_MU_INTERCEPT_LOR, PRIOR_SD_INTERCEPT_LOR), 
          class = b, coef = Intercept) +
    prior(normal(0, PRIOR_SD_RECIPE_LOR), class = b, coef = recipeB) +
    prior(normal(0, PRIOR_SD_RECIPE_LOR), class = b, coef = recipeC) +
    prior(normal(0, PRIOR_SD_RECIPE_LOR), class = b, coef = recipeD)

fit2 <- brm(
    data = df,
    family = binomial,
    formula = my_formula,
    prior = my_priors,
    stanvars = my_stanvars, 
    cores = 7,
    seed = 42
    ) 

fixef(fit2)['Intercept','Estimate'] |> plogis()

[1] 0.4999328

In my mental model, if the Intercept in the raw data has a mean of 0.493, then a prior centered at 0.49 should only be able to bias the posterior estimate toward 0.49, but clearly in this case the posterior estimate is moving away from the prior for the Intercept, and I am struggling to understand how and why.

jsocolar · June 18, 2024, 12:30am

First, note that this is a very small difference, which might just be sampling variation. If you are certain you’ve ruled that out, this result still isn’t necessarily surprising, because the relevant distributions are more than just their means. Suppose, for example, that the mean of the likelihood function is influenced by a very long left-hand tail. If we multiply by a prior that suppresses that left-hand tail, then even if the prior has the same mean as the likelihood function, and even if the prior is symmetric, the mean of the posterior will be higher than the mean of both the prior and the likelihood.

Your intuition is valid only in cases where the prior and likelihood are both symmetric about their shared mean.

Topic		Replies	Views
Confusion about random intercept in binary logistic regression in brms Modeling brms	2	78	November 30, 2024
Logistic brms model: custom prior and interpreting results brms	5	1789	August 26, 2021
Question about sigma posterior not being affected much by sigma prior brms fitting-issues	4	713	December 12, 2020
Make model only based on prior brms fitting-issues , brms	2	671	April 12, 2022
Problems with default prior for family = "binomial" brms	1	1226	July 9, 2018

Unexpected influence from prior in binomial model in brms

Related topics