Regarding prediction/posterior_epred/posterior_predict from brms ordinal model

I have a question that’s both related to brms specifically (primarily prediction/post-processing of outputs from an ordinal model), which I hope that somebody in the community is able to help with.
The problem at hand is secondarily related to optimal modelling of a skewed count outcome with zero-inflation and upper truncation, which I’d be happy to receive modelling suggestions for as well.
Context and a minimal reproducible example code is provided below.

Intended setting: critical care randomised clinical trial comparing two (or more) treatments.
Intended outcome: “days alive without life support at day 90”. This outcome (and similarly defined outcomes) are frequently used in the critical care setting, see e.g. Effect of Hydrocortisone on Mortality and Organ Support in Patients With Severe COVID-19 (although this is only up to day 28, the overall issue is the same). The outcome in this example is registered as a whole number from 0 to 90 for each patient (observation). Often (but not always), patients who die are considered as having the worst possible outcome (0 days, or as in the example in the above link, sometimes -1 days in an ordinal model) regardless of how many days they actually had. In this example below, patients who died are considered to have had an outcome of 0.

The distribution of this outcome is usually highly skewed with both a substantial excess of 0’s and truncation above 90. Here’s R code for producing an example dataset that looks roughly like the usual distribution of this outcome. The only additional variable is treatment assignment, and patients in the treatment group have better outcomes:

set.seed(20210119)
a <- c(90 - rnbinom(1050, mu = 8, size = 15), sample(0:90, 150, replace = TRUE), rep(0, 300))
b <- c(90 - rnbinom(1100, mu = 5, size = 15), sample(0:90, 150, replace = TRUE), rep(0, 250))
dta <- data.frame(treatment = 1:3000 > 1500,
                 res = c(a, b),
                 res_fct = factor(c(a, b), ordered = TRUE))

library(ggplot2)
ggplot(data = dta) +
  geom_bar(aes(x = res_fct, colour = treatment), position = "dodge") +
  scale_x_discrete(labels = ifelse(0:90 %% 5 == 0, 0:90, rep("", 91)))

I’m interested in modelling this outcome (in the actual problem, I want to adjust for additional variables, but that can be ignored for now), and subsequently estimating the mean expected number of days alive without life support for a reference patient in each of the two treatment group (with all additional covariates set to sensible values, e.g. most common category for categorical covariates - not included here). Subsequently, I plan to use this to estimate additional quantities, e.g. the difference in means.

I’ve attempted with multiple models, including a hurdle-negative binomial model, which does not provide a perfect fit to the overall distribution based on posterior predictive checks, but excellently captures the expected mean in each group (which is what I’m primarily interested in).

Example:

# Hurdle-negative binomial model
library(brms)
fit_hnb <- brm(bf(res ~ treatment, hu ~ treatment), family = hurdle_negbinomial(), data = dta,
               cores = 4, backend = "cmdstanr", refresh = 0)
fit_hnb
pp_check(fit_hnb, type = "dens_overlay_grouped", group = "treatment") # Imperfect overall fit
pp_check(fit_hnb, type = "stat_grouped", stat = "mean", group = "treatment") # Predicted means are captured very well and matches input data acceptably

# Use posterior_epred with two "reference patients" to calculate further quantities (e.g. differences in means with 95% CrIs)
pep_hnb <- posterior_epred(fit_hnb, newdata = data.frame(treatment = c(FALSE, TRUE)))
quantile(pep_hnb[, 1] - pep_hnb[, 2], probs = c(0.025, 0.5, 0.975))

This works and may be acceptable for the problem at hand as it adequately describes the quantity of interest (the mean in each group - this is also the case more complicated example datasets). However, as the first posterior predictive check plot reveals, the overall fit is not great. If I instead model the outcome as ordinal using family = cumulative(), I get a near-perfect posterior predictive check plot for hte overall distribution:

# Ordinal model
fit_ord <- brm(res_fct ~ treatment, family = cumulative(), data = dta,
            cores = 4, backend = "cmdstanr", refresh = 0)
fit_ord
pp_check(fit_ord, type = "dens_overlay_grouped", group = "treatment") # Excellent overall fit

This looks a lot better, and the approach is similar to what has been done elsewhere (e.g. in the reference above). It allows me to calculate an odds ratio for the difference between the two groups (under the proportional odds assumption). This is good, but if I want to not only calculate the odds ratio, but also the adjusted expected mean number of days in each group (and derived quantities, as stated above), I’m unsure how to proceed.

If I use posterior_epred for two “reference patients” here, I’m not sure how to convert the results to the actual number of days (or the actual categories of the factor, which could then be easily converted). I assume I have to relate the values to the intercepts, but I am unsure about how to do it.

If I instead use posterior_predict, I get sensible results, but as for other models this seems to predict an overall distribution of potential outcomes and not just the mean. This is not really what I’m interested in. In addition, it seems that posterior_predict generates data that generally matches the input dataset, however, for some reason it appears that it does not generate any values in the lowest (and most common) category, which is confusing. See example code:

pep_ord <- posterior_epred(fit_ord, newdata = data.frame(treatment = c(FALSE, TRUE))) # Unsure how to interpret/convert - use thresholds?
pp_ord <- posterior_predict(fit_ord, newdata = data.frame(treatment = c(FALSE, TRUE))) # Additional uncertainty due to use of posterior_predict and not posterior_epred?

# Comparing percentages of original data:
table(dta$res_fct)/length(dta$res_fct)*100
# With percentages from pp_ord:
table(pp_ord)/length(pp_ord)*100
# Seems to match the input data closely, except that the first level (0) is missing and seems to be merged with the second level (1), even though the levels are in the data? It seems that brms does not predict the most common level at all?

My questions are as follows:

Question 1: Is it possible to use posterior_epred from a brms model with family = cumulative() to obtain expected predictions somehow? I assume that I somehow have to convert the values according to the intercepts/thresholds, but I am not completely sure about how to do it right, and whether this approach really makes sense.

Question 2: Is there an error with posterior_predict in the above example, or how come not a single zero is predicted, while all other values/levels of the factor seem to be reproduced very well?

Question 3, less important: Any additional suggestions for modelling an outcome like this would be very welcome.

Thanks in advance for any help.

  • Operating System: macOS Catalina 10.15.7
  • brms Version: 2.14.4
  • cmdstanr backend, verison 0.3.0 with cmdstan 2.25.0
1 Like

You mention that the data have an upper truncation bound at 90. It might be a good idea to add that into the brms formula for the hurdle negative binomial model. Something like:

fit_hnb <- brm(bf(res|trunc(ub = 90) ~ treatment, hu ~ treatment), family = hurdle_negbinomial(), data = dta, cores = 4, backend = "cmdstanr", refresh = 0)

Will take a while to fit though.

1 Like

Thanks for the suggestion.
I’ve already tried that previously, and have repeated it now. Adding an upper boundary makes sense, however, despite including relevant priors, this leads to problems with chains not finishing succesfully, extreme increases in sampling times (>500 times the speed compared to the same model without the upper boundary) and non-convergence (and increasing chain lengths is not attractive at this speed). Thus, that does not seem to be a practically feasible solution, especially not as the HNB model without an upper boundary seems to recover the parameter of interest in a range of situations (the mean).

Other suggestions are welcome, as are comments on why posterior_predict for the ordinal model doesn’t generate any values in the most common category - I’m still puzzled by this.

Thanks again!

I’ll just bump this in case anybody has any insights to share.

Regarding question #1, I suppose this is not possible as posterior_epred returns a value on the continuous scale, and the outcome only is ordinal and thus it can’t be converted to a value between to ordinal levels (despite the original data being numeric).
Still wondering about question #2 and the lack of values in the most frequent category when using posterior_predict - is this an error with my code or with brms? Tagging @paul.buerkner in case he knows.

Thanks again.

Hi, I found this reply to help many people understanding the three posterior functions commonly used:

Thanks!
However, I am aware of the general differences between the three functions (linpred expected values/means on the linear scale, epred expected values/means transformed to the original outcome scale, and predict being predictions on the original outcome scale including error/uncertainty, and thus essentially randomly generating data from the posterior predictive distribution).
This does not unfortunately not completelysolve the questions above.

  1. Is there any appropriate way to transform the posterior_epred results from an ordinal model to an expected mean on the original scale, when all categories are essentially categorisations of a numeric predictor with equal-spacing between categories? I assume not (since just transforming the values from posterior_epred according to the category thresholds would lead to some information loss by categorisation).
  2. For the second question, why the posterior_predict results for the ordinal model in the above example does not seem to predict any responses in the most common category.
    By looking again, I assume it does, it just returns the factor as an integer instead of the named values and thus the first level, “0”, is coded as 1, and as 1 value between 0 and 90 was not generated in the random data (69) this was omitted. It appears that even if this level is forced in the original factor (when generating the data), it is omitted by brms if it does not exist in the data. So this seems to be the answer to the second question. Thus it’s possible to easily convert integer levels of the factor to the named levels and convert those to integers, if wanted.
1 Like

If anybody else stumbles upon this post, I figured out how to solve issue #1 above.

posterior_epred returns the probabilities of each level of the factor (numbered 1:(total number of levels), disregarding the actual values). As long as all levels in the factor are actual numbers, a mean can be obtained by taking the sum of each actual value (not the recoded levels, but their original values) * the probabilities from posterior_epred, as described by McElreath and in Solomon Kurz’ “translation” of McElreath’s book: 11 Monsters and Mixtures | Statistical rethinking with brms, ggplot2, and the tidyverse

A simple function for this, wrapping posterior_epred, like this can be used (on the model fit from the first post):

post_epred_ord_means <- function(fit, newdata = fit$data){
  dta <- fit$data
  y <- all.vars(as.formula(fit$formula))[1]
  lvls <- sort(unique(as.numeric(as.character(dta[[y]]))))
  pep <- posterior_epred(fit, newdata)
  nsamp <- dim(pep)[1]
  npts <- dim(pep)[2]
  m <- matrix(nrow = nsamp, ncol = npts)
  for (i in 1:nsamp){
    for (j in 1:npts){
      m[i, j] <- sum(pep[i, j, ] * lvls)
    }
  }
  m
}

post_means <- post_epred_ord_means(fit_ord, newdata = data.frame(treatment = c(FALSE, TRUE)))

quantile(post_means[, 1], probs = c(0.5, 0.025, 0.975))
quantile(post_means[, 2], probs = c(0.5, 0.025, 0.975)

This is not a perfect match to the actual means in each group, but the method works.

2 Likes