Ordinal regression model: transform intercepts (thresholds) to estimate on the response scale

I’m fitting an ordinal regression model and I would like to transform the draws of the model to obtain the posterior distributions of the estimates on the original response scale (for example, a 5-point Likert scale).

As an example, I here create a fake dataset:

library(brms)

nSubj <- 100
dataSurvey <- data.frame(
  SubjectID = rep(seq(1, nSubj), 2),
  Condition = rep(c("Control", "Treatment"), each = nSubj),
  Response = c(
    sample(c(1, 2, 3, 4, 5), size = nSubj, prob = c(1/8, 1/2, 1/4, 1/8, 0), replace = TRUE),
    sample(c(1, 2, 3, 4, 5), size = nSubj, prob = c(0, 1/8, 1/4, 1/2, 1/8), replace = TRUE)
  )
)
dataSurvey$SubjectID <- factor(dataSurvey$SubjectID)

Then I fit the ordinal regression (probit) model:

priors_fit <- c(
  prior(normal(0, 4), class = Intercept),
  prior(normal(0, 4), class = b)
)

fit <- brm(
  Response ~ 1 + Condition + (1|SubjectID),
  family = cumulative("probit"),
  data = dataSurvey,
  prior = priors_fit,
  inits = 0,
  chains = 4,
  cores = 4,
  warmup = 1000,
  iter = 2000,
  sample_prior = TRUE
)

The object fit contains the intercept (thresholds) and the estimate of the categorical variable Condition. With as_draws_df I can obtain the individual draws, but I don’t know how to transform them to obtain the posteriors on the response scale. In other words, I would like to obtain the estimates (for all draws) on the same scale as the one display here in the column labeled estimate__:

conditional_effects(fit)$Condition

which are the estimated averages of the Control and Treatment groups in my dataset.

library(tidyverse)
dataSurvey %>% 
  group_by(Condition) %>% 
  summarise(
    mResponse = mean(Response)
  )

Thanks!

There are two ways to look at the fitted outputs of the model (I’m guessing you’re already familiar with the first here!):

You can either look at the probability per category for each draw using posterior_epred:

pred_probabilities <- posterior_epred(fit,summary=FALSE)

> head(pred_probabilities[1,,])
               1         2         3          4            5
[1,] 0.061565611 0.6475853 0.2169693 0.07298350 0.0008962115
[2,] 0.075882115 0.6692796 0.1949755 0.05924615 0.0006166467
[3,] 0.004975054 0.3089023 0.3458414 0.32182974 0.0184515003
[4,] 0.102613516 0.6929474 0.1619454 0.04215369 0.0003399055
[5,] 0.015978267 0.4631726 0.3215983 0.19337377 0.0058770833
[6,] 0.025925836 0.5330234 0.2930140 0.14477291 0.0032637923

Or you can request the actual predicted category using posterior_predict:

pred_categories <- posterior_predict(fit,summary=FALSE)
> head(pred_categories[1,])
[1] 2 3 2 2 2 2

I have found a solution to my above problem. It provides me with the estimates I was looking for, but the steps to obtain them are very fiddly. I would be very interested to know if there is a simpler way of getting to the same solution.

Continuing from the example provided in the question we extract the draws of the posterior related to the intercept (thresholds) and to the categorical variable Condition:

post <- as_draws_df(fit) %>% 
  select(
    starts_with("b_")
  )

Then, we can compute the estimated average value on the response (Likert) scale for each draw:

postEst <- post %>% 
  mutate(
    EstControl = 1*pnorm(`b_Intercept[1]`) + 
      2*(pnorm(`b_Intercept[2]`) - pnorm(`b_Intercept[1]`)) +
      3*(pnorm(`b_Intercept[3]`) - pnorm(`b_Intercept[2]`)) +
      4*(pnorm(`b_Intercept[4]`) - pnorm(`b_Intercept[3]`)) +
      5*(1 - pnorm(`b_Intercept[4]`)),
    EstTreatment = 1*pnorm(`b_Intercept[1]` - b_ConditionTreatment) + 
      2*(pnorm(`b_Intercept[2]` - b_ConditionTreatment) - pnorm(`b_Intercept[1]` - b_ConditionTreatment)) +
      3*(pnorm(`b_Intercept[3]` - b_ConditionTreatment) - pnorm(`b_Intercept[2]` - b_ConditionTreatment)) +
      4*(pnorm(`b_Intercept[4]` - b_ConditionTreatment) - pnorm(`b_Intercept[3]` - b_ConditionTreatment)) +
      5*(1 - pnorm(`b_Intercept[4]` - b_ConditionTreatment))
  )

If we take the median of EstControl and EstTreatment we obtain the exact same values as the ones found in the column labeled estimate__ when executing conditional_effects(fit)$Condition:

median(postEst$EstControl)
median(postEst$EstTreatment)

But, again, although the above steps lead me to the correct solution, I would be very grateful if someone (@paul.buerkner) could point me to a simpler solution.

Thanks, I’ll explore the posterior_predict route. In the mean time I have found a solution to my issue (see my reply to the original question), but I’m still interested to know if there is a more straightforward way of obtaining the estimates I’m looking for.