How to interpret the results of a cumulative logit model in brms (calculating the probabilities of being in a category)

Hello,

I’m working with an ordered categorical outcome variable in my model and am trying to interpret the results. Specifically, I’m analyzing how workload (measured as antwoordtekst) affects the employee loyalty index (eNPS), which is categorized into three ordered levels.

Here’s the model I used:

formula <- bf(category ~ antwoordtekst + (1 + antwoordtekst || technische_sleutel), family = cumulative("logit"))

fit_workload_1 <- brm(formula = formula, data = testset_workload, family = cumulative("logit"), iter = 1000, chains = 2, warmup = 500, cores = parallel::detectCores())

In the output, I have the following regression coefficients:

Regression Coefficients:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]     -6.94      0.17    -7.28    -6.62 1.01      251      437
Intercept[2]     -3.50      0.15    -3.80    -3.20 1.01      321      477
antwoordtekst    -1.13      0.05    -1.22    -1.04 1.01      295      491

I understand that Intercept[1] and Intercept[2] represent the thresholds between the categories. Specifically, does Intercept[1] indicate the logit of being in category 2 or higher, and Intercept[2] the logit of being in category 3? It seems strange that both intercepts are negative, since most people in the data are in category 3. How do I calculate the probabilities correctly?

Welcome to the community, Charlotte!

The intercepts are thresholds on the logit scale for being at or below each category, rather than above it. In addition, the logit for a given observation also includes the effect of antwoordtekst.

brms parameterises the cumulative model as threshold minus linear predictor, so the cumulative logits take the following form:

At or below the 1st (lowest) outcome category: -6.94 - (-1.13 * antwoordtekst) = -6.94 + 1.13 * antwoordtekst
At or below the 2nd category (i.e., in category 1 or 2): -3.50 - (-1.13 * antwoordtekst) = -3.50 + 1.13 * antwoordtekst

You can turn these into probabilities with the inverse-logit function: 1 / (1 + exp(-x)), where x is one of the cumulative logits above (e.g., -6.94 + 1.13 * antwoordtekst). The probability of being in the top category is one minus the probability of being in the first two categories.

brms has the inv_logit_scaled function to convert from the logit scale to the probability scale, so you can do:

inv_logit_scaled(-3.50 + 1.13 * antwoordtekst)

# Specific example:
inv_logit_scaled(-3.50 + 1.13 * c(0.5, 1, 1.5))
# approximately 0.050 0.085 0.141

The negative intercepts indicate that, at antwoordtekst = 0, the probability of being in the top outcome category is very high (1 - inv_logit_scaled(-3.50), about 0.97), as you expected from your data. The negative coefficient for antwoordtekst means that higher values shift probability toward the lower categories.
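To get all three category probabilities at once, here is a minimal base-R sketch. It assumes the brms parameterisation P(Y <= k) = inv_logit(threshold_k - b * x), plugs in only the posterior means from the summary above, and ignores the group-level effects, so treat the results as rough point estimates. The helper name category_probs is made up for illustration and is not part of brms.

```r
# Posterior means copied from the summary output in this thread
tau <- c(-6.94, -3.50)  # Intercept[1], Intercept[2] (thresholds)
b   <- -1.13            # coefficient for antwoordtekst

# Hypothetical helper (not a brms function): per-category probabilities
# for a cumulative logit model, given predictor values x.
category_probs <- function(x, tau, b) {
  eta <- b * x
  # Cumulative logits: threshold_k - eta (rows = values of x, cols = thresholds)
  z <- outer(-eta, tau, "+")
  # Inverse logit gives P(Y <= k)
  cum <- 1 / (1 + exp(-z))
  # Differences of adjacent cumulative probabilities give P(Y = k)
  probs <- cbind(cum, 1) - cbind(0, cum)
  colnames(probs) <- paste0("P(cat ", seq_len(ncol(probs)), ")")
  probs
}

p <- category_probs(c(0, 0.5, 1, 1.5), tau, b)
print(round(p, 3))
```

Each row corresponds to one value of antwoordtekst and sums to 1. For estimates with uncertainty, it is better to work from the posterior draws than from the posterior means.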

The tidybayes package offers the add_epred_draws() function, which will give you per-category probabilities for a cumulative brms model.
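A sketch of how that might look, assuming the fitted model fit_workload_1 from the original post is in the workspace. The newdata values are made up for illustration, and re_formula = NA is one possible choice that drops the group-level effects so that technische_sleutel does not need to appear in newdata.

```r
library(brms)
library(tidybayes)
library(dplyr)

# Illustrative predictor values (an assumption, not taken from the data)
newdata <- data.frame(antwoordtekst = c(0.5, 1, 1.5))

# One row per posterior draw, predictor value, and outcome category;
# .epred holds the probability of that category.
epred <- newdata |>
  add_epred_draws(fit_workload_1, re_formula = NA)

# Posterior median and 95% interval of each category probability
epred |>
  group_by(antwoordtekst, .category) |>
  median_qi(.epred)
```

The .category column identifies the outcome level, so the summary has one row per combination of antwoordtekst and category.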

Thanks, that helps a lot.