Figuring out what fitted and posterior_predict do in brms - especially for a binomial likelihood

bnicenboim · April 11, 2023, 10:36am

TLDR; I don’t understand why fitted() returns values above 1 for family binomial and that lead me to question what exactly fitted() and posterior_predict() do in brms.

My logic seems fine for models with a normal and bernoulli likelihood, but not for a binomial likelihood:

For a linear regression–normal likelihood–, I think it’s straight-forward, fitted should be: draw_intercept + draw_slope * predictor, ignoring sigma, posterior_predict samples from a normal distribution with location equal to the fitted values, and scale equal to draws of the posterior of sigma.
I do it below in R and it checks out, of course, the predictions have the randomness of sigma so they won’t be the same.

library(brms)
library(dplyr)
options(mc.cores = parallel::detectCores())

N <- 15
set.seed(123)
d1 <- tibble(x = runif(N), y = rnorm(N, x*2, .1))
m1 <- brm(y ~ x, data = d1)
post1 <- as.data.frame(m1)

# Fitted values: 
fitted(m1,summary = FALSE) %>% head()
# by "hand"
apply(post1, 1, function(d) d["b_Intercept"] + d["b_x"] * d1$x) %>% 
  t() %>% head()

# predictions
posterior_predict(m1) %>% head()
# by "hand"
apply(post1, 1, 
      function(d) rnorm(nrow(d1), d["b_Intercept"] + d["b_x"] * d1$x, d["sigma"])) %>% 
  t() %>% head()
# They won't be same, but this should be the logic

For a logistic regression with a Bernoulli family, I guess fitted is: inv_logit(draw_intercept + draw_slope * predictor), posterior_predict samples from a bernoulli distribution with theta = fitted value.
I do it below in R and seems fine:

set.seed(123)
d2 <- tibble(x = runif(N), y = rbinom(N, 1, plogis(x*.5)))
m2 <- brm(y ~ x, data = d2, family = bernoulli())
post2 <- as.data.frame(m2)

# Fitted values: 
fitted(m2, summary = FALSE) %>% head()
# by "hand"
apply(post2, 1, function(d) plogis(d["b_Intercept"] + d["b_x"] * d1$x)) %>% 
  t() %>% head()

# predictions
posterior_predict(m2) %>% head()
# by "hand"
apply(post2, 1, 
      function(d) rbinom(nrow(d1),1, plogis(d["b_Intercept"] + d["b_x"] * d1$x))) %>% 
  t() %>% head()

If the family is binomial, I thought the logic of fitted() should be identical to the previous case, but it isn’t. I get fitted values above 1 (see fitted(m3, summary = FALSE) %>% head() below). Of course when I predict, I use a binomial RNG with appropriate size, rather than with a size of 1. So here it is below. Generating predictions, seems fine, but I really don’t get why fitted values can be over 1.

set.seed(123)
d3 <- tibble(x = runif(N), y = rbinom(N, 10, plogis(x*.5)))
m3 <- brm(y | trials(10) ~ x, data = d3, family = binomial())
 
post3 <- as.data.frame(m3)

# Fitted values: 
fitted(m3, summary = FALSE) %>% head()
# by "hand"
apply(post3, 1, function(d) plogis(d["b_Intercept"] + d["b_x"] * d1$x)) %>% 
  t() %>% head()
# this is off

# predictions
posterior_predict(m3) %>% head()
# by "hand"
apply(post3, 1, 
      function(d) rbinom(nrow(d1),10, plogis(d["b_Intercept"] + d["b_x"] * d1$x))) %>% 
  t() %>% head()
# I guess this is fine

paul.buerkner · April 11, 2023, 10:46am

fitted and posterior_epred returns expected values. The expected value of a binomial is ntrials * theta not theta. If you want to get the latter, set your ntrials variable to 1 in newdata and that should do the trick.

bnicenboim · April 11, 2023, 10:52am

ok, that makes sense, thanks! :)

Topic		Replies	Views
Difference between add_fitted_draws() and add_predicted_draws() when applied to brms binomial (Bernoulli) models General techniques , fitting-issues , specification , brms	3	1169	June 15, 2021
Posterior_predict() in brms brms	1	900	June 14, 2020
Fitted and predicted draws on average brms	6	1586	January 18, 2020
Calculating predicted probabilities from a binomial (bernoulli) model brms posterior-predictive , interpret-results	2	2600	July 2, 2020
Counterfactual predictions with credible intervals for bernoulli model brms	4	691	July 18, 2019

Figuring out what fitted and posterior_predict do in brms - especially for a binomial likelihood

Related topics