Manually generate posterior predictions from negative binomial model fit with brms

qdread · November 22, 2022, 6:07pm

I exported the parameters from a model fit with brms so that I can calculate posterior predictions from the model “by hand.” This is because I eventually want to generate predictions in a GUI app coded in a different language where I will be able to import the parameters but won’t have access to the predict.brmsfit() method. However I believe that I am generating the predictions incorrectly. Can someone please take a look at my code and tell me what is incorrect?

My approach is to extract the posterior draws, multiply the beta coefficients times the x values to get posterior mu values, use the function qnbinom() with those mu values and the posterior shape parameter values to get the appropriate negative binomial quantile, then take the median of the resulting vector to get the point estimate. I get very close to the correct answers returned by predict.brmsfit() but there are a few discrepancies. Any help would be appreciated.

exdat <- structure(list(pace = c(18.43, 18.43, 18.43, 18.43, 18.43, 18.43, 
                                 41.64, 41.64, 41.64, 41.64, 41.64, 41.64, 0, 0, 0, 0, 0, 0, 8.09, 
                                 8.09, 8.09, 8.09, 8.09, 8.09, 21.42, 21.42, 21.42, 21.42, 21.42, 
                                 21.42, 44.72, 44.72, 44.72, 44.72, 44.72, 44.72, 0, 0, 0, 0, 
                                 0, 0, 9.18, 9.18, 9.18, 9.18, 9.18, 9.18, 23.19, 23.19, 23.19, 
                                 23.19, 23.19, 23.19, 46.82, 46.82, 46.82, 46.82, 46.82, 46.82, 
                                 0, 0, 0, 0, 0, 0, 9.57, 9.57, 9.57, 9.57, 9.57, 9.57, 23.97, 
                                 23.97, 23.97, 23.97, 23.97, 23.97, 48.08, 48.08, 48.08, 48.08, 
                                 48.08, 48.08), 
                        plac = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.79, 
                                 1.79, 1.79, 1.79, 1.79, 1.79, 2.52, 2.52, 2.52, 2.52, 2.52, 2.52, 
                                 3.22, 3.22, 3.22, 3.22, 3.22, 3.22, 4.02, 4.02, 4.02, 4.02, 4.02, 
                                 4.02, 11.18, 11.18, 11.18, 11.18, 11.18, 11.18, 11.75, 11.75, 
                                 11.75, 11.75, 11.75, 11.75, 12.61, 12.61, 12.61, 12.61, 12.61, 
                                 12.61, 13.47, 13.47, 13.47, 13.47, 13.47, 13.47, 31.67, 31.67, 
                                 31.67, 31.67, 31.67, 31.67, 31.94, 31.94, 31.94, 31.94, 31.94, 
                                 31.94, 32.47, 32.47, 32.47, 32.47, 32.47, 32.47, 33.25, 33.25, 
                                 33.25, 33.25, 33.25, 33.25), 
                        FLR = c(10L, 10L, 3L, 4L, 2L, 2L, 
                                7L, 7L, 2L, 2L, 1L, 1L, 10L, 10L, 6L, 6L, 4L, 4L, 9L, 8L, 2L, 
                                3L, 2L, 2L, 6L, 6L, 2L, 2L, 1L, 1L, 4L, 4L, 2L, 2L, 1L, 1L, 4L, 
                                4L, 2L, 2L, 1L, 1L, 3L, 3L, 2L, 2L, 1L, 1L, 3L, 3L, 1L, 1L, 1L, 
                                1L, 3L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 
                                1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L)), 
                   row.names = c(NA,  -84L), class = "data.frame")

options(mc.cores = 2, brms.backend = 'cmdstanr')
exmod <- brm(FLR ~ pace + plac,
             data = dat, family = negbinomial(link = 'log', link_shape = 'log'),
             chains = 2, iter = 2000, warmup = 1000, seed = 1110,
             prior = prior(normal(0, 2), class = b))

percentiles <- c(0.5, 0.1, 0.9)
newdat <- expand.grid(pace = c(0, 5, 10, 20), plac = c(0, 5, 10, 20))

# brms predictions
brms_predictions <- predict(exmod, newdata = newdat, probs = percentiles, robust = TRUE)

# my interpretation of how to achieve the predictions
post <- as_draws_df(exmod)

my_pred_fn <- function(pace, plac) {
  mus <- as.matrix(post[, 1:3]) %*% c(1, pace, plac)
  sapply(percentiles, function(p) median(qnbinom(p, mu = exp(mus), size = post$shape)))
}

my_predictions <- t(mapply(my_pred_fn, newdat$pace, newdat$plac))

brms_predictions[, 3:5] - my_predictions # Close but not identical.

scholz · November 24, 2022, 9:54am

predict.brmsfit() wraps posterior_predict, which gives you draws from the posterior predictive distribution. It sounds like what you are doing is closer to posterior_epred, which gives you “expected values of the posterior predictive distribution" (see Expected Value of Posterior vs. Posterior of Expected Value with epred - #3 by JimBob for the difference).
In addition, remember that in bayesian modeling, we are usually modeling an expected value or median for each individual observation, so there is some aleatoric uncertainty in your predictions unless you just go with the posterior mean (which is again the predict vs epred thing).

qdread · December 14, 2022, 5:05pm

Thanks very much for this helpful clarification. Do you have any idea how I can reproduce the behavior of posterior_predict()?

scholz · December 14, 2022, 5:30pm

You can simply run each sample through the rng of your likelihood. Negbinom in your case. Just manually build you linear predictor and apply the inverse link to it before putting it into the rng. You might have to transform between the parameterization of the rng and the one brms uses as well.

qdread · December 14, 2022, 7:35pm

Thanks a lot. I believe your approach is the correct solution and I’ll mark it as such, but just to confirm, this is the new code I used to manually generate the draws from the posterior distributon and get the quantiles.

my_pred_fn <- function(pace, plac) {
  linpred <- as.matrix(post[, 1:3]) %*% c(1, pace, plac)
  draws <- rnbinom(n = nrow(post), mu = exp(linpred), size = post$shape)
  quantile(draws, probs = percentiles)
}

When I set the same random seed before predict.brmsfit and before my own version, they generate identical results:

set.seed(234)
brms_predictions <- predict(exmod, newdata = newdat, probs = percentiles, robust = TRUE)

set.seed(234)
my_predictions <- t(mapply(my_pred_fn, newdat$pace, newdat$plac))

all.equal(unname(brms_predictions[, 3:5]), unname(my_predictions)) #TRUE

scholz · December 14, 2022, 7:44pm

On mobile but looks good to me and if you get the same as predict that’s a good sign :)

Topic		Replies	Views
Brms predictions inconsistent with custom model (likely user error) General	7	409	September 13, 2023
Figuring out what fitted and posterior_predict do in brms - especially for a binomial likelihood brms brms	2	472	April 11, 2023
Computing fitted values for Weibull using posterior samples? brms	9	807	February 4, 2021
Manually computing R^2 and receiving negative output brms	5	585	August 14, 2020
Computing a 95% prediction interval from brms model brms brms	3	686	November 18, 2023

Manually generate posterior predictions from negative binomial model fit with brms

Related topics