A very short practical guide for Bayesian ordinal models in brms

To help someone understand how to set priors for an ordinal model with a cumulative link in brms, I had to run a bunch of simulations to really get a grasp on it myself. I ended up writing a short post about it:

12 Likes

Loved `ppd_ridgeplot`, super smart visualization!

1 Like

Thanks! I didn’t change the ppc_bars, but I think it would also be a nice way to show the predictions there. The error bars can be very misleading, especially if you get a lot of predictions of 0s and 1s. Not sure what @jonah thinks, but it might be something interesting for bayesplot.

1 Like

If I may suggest: I would show separate overlapping histograms (alpha ā‰ˆ 0.7) for each read group (0, 1) in the prior predictive distribution for the slope. That should better highlight the impact of the prior on this parameter.

Also, could you explore the impact of choosing means other than 1 in "Intercept-only model: Nudging priors"?
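For instance, here's a quick base-R sketch (no brms, and the numbers are purely illustrative) of what shifting the common mean of the threshold prior implies for the category probabilities:

```r
# Sketch: the common mean of the threshold (Intercept) prior "nudges"
# the implied category probabilities. sort() stands in for the order
# constraint; 5 response categories -> 4 thresholds.
set.seed(123)
probs_for_mu <- function(mu, ndraws = 2000, sd = 1) {
  draws <- replicate(ndraws, {
    tau <- sort(rnorm(4, mean = mu, sd = sd))
    diff(plogis(c(-Inf, tau, Inf)))
  })
  rowMeans(draws)
}
round(probs_for_mu(-1), 2)  # low thresholds: mass pushed to higher categories
round(probs_for_mu(1), 2)   # high thresholds: mass pushed to lower categories
```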

Great work!

This is a very nice intro! Though (if I've read the code correctly), I'm wondering why you assign the priors for the Intercepts the same location? If we know that our response is ordered, that seems like a rather strange choice to me.

Yeah, this seems like a good idea.

I started from Paul BĆ¼rkner's tutorial, which does that. You're right that you can, but it seems like overkill to me. It could be useful if one has very specific prior information about the thresholds. But it seems that, with the order constraint, one common prior is enough to (1) get them relatively uniformly separated, (2) put more weight on the extremes, or (3) nudge more probability toward the higher or lower categories. And these seem to be the most common scenarios. But I think it's worth adding a footnote that this is possible, by setting coef to the numbers 1 to K - 1 (one per threshold).

I like it! I’m always open to adding new useful plots to bayesplot. I don’t fit ordinal models too often in my own work, so they are probably a bit neglected when it comes to what bayesplot offers. So it would be great to get some contributions from people who deal with ordinal models more often and have a better sense for what’s useful.

2 Likes

Just wrote the code adapting your function to groups: ppd_ridgeplot with groups · GitHub

I was surprised by the persistent overlap between the histograms despite changing the prior for `read`. Can you explain this behavior?

I think it's kind of expected: the prior for the slope is centered at zero, so beta might increase or decrease the proportions.

In cases like this, I think a prior predictive plot showing the difference between the conditions might be more informative.

But another issue is that this type of plot doesn't show how the predictions affect all the categories: there's always a trade-off, e.g., an increase in category 1 is linked to a decrease in category 5.
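That trade-off is easy to see outside brms: under a cumulative logit the category probabilities are differences of plogis(tau - eta), so whatever probability a shift in eta adds to some categories it must remove from the others. A minimal sketch using the thresholds and slope from the simulation in this thread:

```r
# Category probabilities for a cumulative-logit model with thresholds tau
tau <- c(-2, -1, 1.5, 3)  # thresholds used in the simulation in this thread
cat_probs <- function(eta) diff(plogis(c(-Inf, tau - eta, Inf)))

p_noread <- cat_probs(0)    # no-read condition (eta = 0)
p_read   <- cat_probs(1.5)  # read condition (eta = beta = 1.5)

# Gains and losses across categories cancel exactly: the differences sum to 0
round(p_read - p_noread, 3)
```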

Maybe this kind of plot is better:

And if I look at the difference between read and not read conditions, the prior predictive distributions look even stranger!

I'm not entirely sure why the prior prediction says that the later categories won't change. Notice that there's no 5: that's because there's no predicted difference between the read and no-read conditions for category 5!

Everything is quite symmetrical here:

priors_full <- c(
  prior(normal(0, 3), class = "Intercept"),
  prior(normal(0, 1.5), class = "b")
)
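As a sanity check outside brms, here's a rough base-R approximation of the prior predictive for the per-category read - no-read differences under these priors; note that sort() is only a stand-in for brms' ordered-threshold parameterization, so this is a sketch, not exactly what brms samples:

```r
# Rough base-R approximation of the prior predictive implied by priors_full
set.seed(123)
ndraws <- 4000
diffs <- t(replicate(ndraws, {
  tau  <- sort(rnorm(4, mean = 0, sd = 3))  # Intercept prior, ordered
  beta <- rnorm(1, mean = 0, sd = 1.5)      # slope prior
  diff(plogis(c(-Inf, tau - beta, Inf))) -  # P(category | read)
    diff(plogis(c(-Inf, tau, Inf)))         # P(category | no read)
}))
colnames(diffs) <- paste0("cat", 1:5)
summary(diffs)  # within each draw, the five differences sum to zero
```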

I would expect them to be symmetrical. I have no idea what's going on! This is the new doc:

2026-01-09-bayesian-ordinal-models-a-very-short-practical-guide.html (8.6 MB)

And if someone wants to take a look at the new code, it’s also here:

This is what you meant, right?

(taus <- qlogis(seq(from = 0.2, to = 0.8, by = 0.2)))  # evenly spaced on the logit scale

#priors_better <- c(
#  prior_string(paste0("normal(",taus,", 3)"), class = "Intercept", coef = 1:4)
#)

priors_better <- c(
  prior(normal(-1.39, 3), class = Intercept, coef = 1),
  prior(normal(-0.41, 3), class = Intercept, coef = 2),
  prior(normal(0.41, 3), class = Intercept, coef = 3),
  prior(normal(1.39, 3), class = Intercept, coef = 4)
)



fit_better <- brm( 
  response ~ 1,
  data = df_sim,
  family = cumulative(),
  prior = priors_better,
  sample_prior = "only",
  cores = 4,
  seed = 123,
  refresh = 0,
  control = list(adapt_delta = 0.9)
)

Warning: The global prior 'student_t(3, 0, 2.5)' of class 'Intercept' will not
be used in the model as all related coefficients have individual priors
already. If you did not set those priors yourself, then maybe brms has assigned
default priors. See ?set_prior and ?default_prior for more details.
fit_better |> get_prior()

                prior     class coef group resp dpar nlpar lb ub tag       source
 student_t(3, 0, 2.5) Intercept                                          default
 student_t(3, 0, 2.5) Intercept    1                                (vectorized)
 student_t(3, 0, 2.5) Intercept    2                                (vectorized)
 student_t(3, 0, 2.5) Intercept    3                                (vectorized)
 student_t(3, 0, 2.5) Intercept    4                                (vectorized)

So either this should be working and there's a bug, or I'm doing something silly?

1 Like

This works for me:

library(extraDistr)
library(ggplot2)
library(brms)
library(bayesplot)
library(patchwork)
library(ggridges)
library(dplyr)

theme_set(theme_minimal(base_size = 14))


set.seed(123)
thres <- c(-2, -1, 1.5, 3)
N <- 1000
tau <- thres
beta <- 1.5
eta_noread <- 0 * beta
eta_read  <- 1 * beta

category <- c("Nothing", "A little", "More or less", "Something", "Everything")

df_sim <- data.frame(
  read = rep(c(0, 1), each = N),
  category = c(
    rcat(N, diff(plogis(c(-Inf, tau - eta_noread, Inf))), labels = category),
    rcat(N, diff(plogis(c(-Inf, tau - eta_read, Inf))), labels = category)
  )
) |>
  mutate(response = as.numeric(category))

# Display summary statistics
table(df_sim$read, df_sim$category)


priors_better <- c(
  prior(normal(-1.39, 3), class = Intercept, coef = 1),
  prior(normal(-0.41, 3), class = Intercept, coef = 2),
  prior(normal(0.41, 3), class = Intercept, coef = 3),
  prior(normal(1.39, 3), class = Intercept, coef = 4)
)

fit_better <- brm( 
  response ~ 1,
  data = df_sim,
  family = cumulative(),
  prior = priors_better,
  sample_prior = "only",
  cores = 4,
  seed = 123,
  refresh = 0,
  control = list(adapt_delta = 0.9)
)

fit_better$prior

Or did you mean the warning message? I think that can just be ignored, or at least I have always done so.

The default priors are correctly replaced by the priors we set in any case:

default_prior(fit_better)

                prior     class coef group resp dpar nlpar lb ub tag       source
 student_t(3, 0, 2.5) Intercept                                          default
 student_t(3, 0, 2.5) Intercept    1                                (vectorized)
 student_t(3, 0, 2.5) Intercept    2                                (vectorized)
 student_t(3, 0, 2.5) Intercept    3                                (vectorized)
 student_t(3, 0, 2.5) Intercept    4                                (vectorized)

> fit_better$prior
                prior     class coef group resp dpar nlpar lb ub tag  source
 student_t(3, 0, 2.5) Intercept                                      default
     normal(-1.39, 3) Intercept    1                                    user
     normal(-0.41, 3) Intercept    2                                    user
      normal(0.41, 3) Intercept    3                                    user
      normal(1.39, 3) Intercept    4                                    user

@Johannes_Schwenke, ok you're right, it works. I just got confused between get_prior() and prior_summary(): get_prior()/default_prior() report the default priors brms would assign to the model, ignoring the user-set ones, while prior_summary() (or fit$prior) shows the priors the fitted model actually used.

1 Like