Why do the posterior predictions from a fitted model differ from predictions from the same data as "newdata"?

pitangus · May 26, 2024, 7:54am

I fitted simulated count data with the brms code given below. Then I made two sets of predictions with the tidybayes function "add_predicted_draws": first the posterior predictions directly from the fitted model, and second using the original fitted data as "newdata".
The predicted sums, means and standard deviations of the counts were different and I can't work out why.
I had planned to compare the posterior predictions with predictions for new data with different covariate values, but if the two sets are generated by different processes, the comparison won't be legitimate.

library("brms")
library(readr)
library(dplyr)
library(bayesplot)
library(tidybayes)
library(boot)
library(stats)
library(ggplot2)
library(tidyr)
library("RcppParallel")
library(priorsense)
library(cmdstanr)

# fit data
zifamily = zero_inflated_poisson(link = "log", link_zi = "logit")

zip_prior <- c(prior(normal(0, 8), class = b),
               prior(lkj(1), class = cor),
               prior(student_t(3, 0, 10), class = sd),
               prior(logistic(0, 1), class = Intercept, dpar = zi),
               prior(student_t(3, 0, 10), class = sd, dpar = zi))
sp_formula = bf(count ~ 0 + Intercept + crop01 + crop02 + crop03 + crop04 + crop05 + (1 || days) + (1 || site) + (1 | g1 | species), zi ~ (1 | g1 | species))

fit <-
  brm(data = my_data,
      family = zifamily,
      formula = sp_formula,
      prior = zip_prior,
      iter = 2000, warmup = 1000, thin = 4, chains = 4, cores = 4,
      control = list(adapt_delta = .99),
      threads = threading(8, grainsize = 100),
      seed = 999,
      # refresh = 0,
      backend = "cmdstanr",
      sample_prior = FALSE
      )

# predictions

apd = add_predicted_draws(fit$data, fit, ndraws = 1, value = 'pcount')

pred <- add_predicted_draws(fit,
                            newdata = newdata,
                            ndraws = 1,
                            value = "new_prediction",
                            seed = 999,
                            summary = FALSE,
                            re_formula = ~ 0 + Intercept + crop01 + crop02 + crop03 + crop04 + crop05 + (1||sday) + (1||site) + (1|g1|species),
                            allow_new_levels = TRUE)

pred_df = tibble(site = pred$site, species = pred$species,
                 days = pred$days, data_pred = apd$pcount,
                 predicted = pred$new_prediction)
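
One thing that may be worth checking before comparing the two results: in the calls above, the first sets no seed while the second sets seed = 999, and each uses ndraws = 1, so the two predictions may come from different posterior draws and different random numbers. The sketch below is a toy illustration in base R only, not the brms or tidybayes internals. The "posterior draws" are made-up stand-ins (lambda_draws, zi_draws), and predict_once is a hypothetical helper. It only shows that zero-inflated Poisson predictions match when the same draw and the same seed are used, and differ otherwise.

```r
# Toy illustration (base R only): posterior predictions are random draws.
# All names and numbers below are hypothetical stand-ins, not values from the model above.
set.seed(1)
n_obs   <- 200    # assumed number of observations
n_draws <- 1000   # assumed number of posterior draws

# Made-up "posterior draws" of the Poisson mean and the zero-inflation probability
lambda_draws <- rgamma(n_draws, shape = 50, rate = 10)
zi_draws     <- rbeta(n_draws, 20, 80)

# One zero-inflated Poisson prediction per observation, using a single posterior draw
predict_once <- function(draw_id, seed) {
  set.seed(seed)
  is_structural_zero <- rbinom(n_obs, 1, zi_draws[draw_id])
  counts <- rpois(n_obs, lambda_draws[draw_id])
  ifelse(is_structural_zero == 1, 0L, counts)
}

p_a <- predict_once(draw_id = 10,  seed = 999)
p_b <- predict_once(draw_id = 10,  seed = 999)  # same draw, same seed
p_c <- predict_once(draw_id = 725, seed = 123)  # different draw, different seed

print(identical(p_a, p_b))   # same draw and seed give the same predictions
print(c(sum_a = sum(p_a), sum_c = sum(p_c)))
print(c(mean_a = mean(p_a), mean_c = mean(p_c)))
print(c(sd_a = sd(p_a), sd_c = sd(p_c)))
```

If this is what is happening in the real model, giving both add_predicted_draws calls the same seed, ndraws, re_formula and allow_new_levels settings, or summarising over many draws instead of one, should make the two sets of predictions more directly comparable. That is an assumption to test rather than a confirmed diagnosis.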
