How to interpret estimates in a 2x2 model with family = categorical

Hello,

I have some hierarchical data where 66 participants in three groups each saw 64 trials in four conditions. On each trial, a participant's behaviour can be described by one of three unordered categories. I would like to model these data using the categorical family in brms. Since I have never worked with categorical models before, I started by modelling some simpler simulated data (no group-level effects, only two conditions and two groups) and am now trying to understand the estimates.

This is the model I ran:

library(tidyverse)
library(brms)

# simulate some data; x scales the cell sizes, so each grp x cnd cell
# contains 10*x trials
x = 250
cnd = rep(rep(c("cnd3", "cnd1"), each = 10*x), times = 2) # two conditions
grp = rep(c("A", "B"), each = 10*x*2) # two groups
OUT = c(
  # group A
  rep("1", 3*x), rep("2", 4.5*x), rep("3", 2.5*x),  # cnd3
  rep("1", 2*x), rep("2", 5*x), rep("3", 3*x),      # cnd1
  # group B
  rep("1", 4*x), rep("2", 3.5*x), rep("3", 2.5*x),  # cnd3
  rep("1", 3*x), rep("2", 4*x), rep("3", 3*x)       # cnd1
)
df.full = data.frame(OUT, grp, cnd) %>%
  mutate(
    cnd = as.factor(cnd), 
    grp = as.factor(grp)
  )

# set sum coding for contrasts; contr.sum(2) codes the first factor
# level (here cnd1 and A, since levels are sorted alphabetically) as
# +1 and the second level as -1
contrasts(df.full$cnd) = contr.sum(2)
contrasts(df.full$cnd) # inspect the coding
contrasts(df.full$grp) = contr.sum(2)
contrasts(df.full$grp) # inspect the coding

# set some priors (without a "class" argument, prior() defaults to
# class "b", i.e. the population-level coefficients)
priors = c(
  prior(normal(0,1), class = "Intercept", dpar = "mu2"),
  prior(normal(0,1), class = "Intercept", dpar = "mu3"),
  prior(normal(0,1), dpar = "mu2"),
  prior(normal(0,1), dpar = "mu3")
)

# compute the model
m = brm(OUT ~ cnd * grp,
        data = df.full,
        family = "categorical",
        backend = "cmdstanr",
        prior = priors,
        chains = 4)

# check out the estimates for the population-level effects 
fixef(m)

# this is my output: 
#                  Estimate  Est.Error        Q2.5       Q97.5
# mu2_Intercept  0.36919383 0.02450578  0.32250145  0.41762155
# mu3_Intercept -0.06147035 0.02623245 -0.11180210 -0.01045539
# mu2_cnd1       0.23298486 0.02430985  0.18667733  0.27981978
# mu2_grp1       0.29130106 0.02451487  0.24367370  0.33819220
# mu2_cnd1:grp1  0.02258405 0.02403750 -0.02390591  0.07105372
# mu3_cnd1       0.26430609 0.02685252  0.21329722  0.31757810
# mu3_grp1       0.17269689 0.02687473  0.11987718  0.22492792
# mu3_cnd1:grp1  0.02980390 0.02711534 -0.02320310  0.08362795

If I understand correctly, the intercepts describe the log odds of OUT2 (or OUT3) relative to OUT1, which is the reference category. So, mu2_Intercept is similar to log(nrow(df.full %>% filter(OUT == "2"))/nrow(df.full %>% filter(OUT == "1"))) = 0.3483067.

Then, the condition and group effects refer to the log odds of OUT2 (or OUT3) versus OUT1 within a certain condition or group, minus the respective intercept. So mu2_cnd1 would be similar to log(nrow(df.full %>% filter(OUT == "2" & cnd == "cnd1"))/nrow(df.full %>% filter(OUT == "1" & cnd == "cnd1"))) - log(nrow(df.full %>% filter(OUT == "2"))/nrow(df.full %>% filter(OUT == "1"))) = 0.23948. [EDIT: Or would I have to subtract both the intercept and the interaction?]
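
For reference, here are those same sample computations written out as a quick check (this just restates the counts above; nothing model-based):

# empirical log odds of OUT == "2" vs OUT == "1", pooled over all cells
n1 = sum(df.full$OUT == "1")
n2 = sum(df.full$OUT == "2")
log(n2 / n1)                    # 0.3483067, close to mu2_Intercept

# the same log odds within cnd1, minus the pooled log odds
n1_c = sum(df.full$OUT == "1" & df.full$cnd == "cnd1")
n2_c = sum(df.full$OUT == "2" & df.full$cnd == "cnd1")
log(n2_c / n1_c) - log(n2 / n1) # 0.23948, close to mu2_cnd1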

Is my understanding correct so far?

What I haven’t been able to figure out is what ratio the interaction effects are estimating, so I would appreciate any information on that as well.

Cheers,
Irene Sophia

  • Operating System: Ubuntu 22.04
  • brms Version: 2.19.0
  • cmdstanr Version: 0.5.3
  • cmdstan version: 2.33.1

I wouldn’t bother trying to interpret the coefficients in a categorical model directly, unless you’re trying to do so for your personal edification (in which case: respect). To my mind, the simplest way is to use convenience functions like add_epred_draws() to return conditional probabilities, which you can then manipulate as desired. For example, here’s how you can use add_epred_draws() and stat_pointinterval() to make your own hand-made posterior-predictive check.

# load new packages
library(tidybayes)
library(ggdist)

# save the sample-statistic-based conditional probabilities
df.full.probability.summaries <- df.full %>% 
  group_by(grp, cnd) %>% 
  count(OUT) %>% 
  mutate(probability = n / sum(n))

# use add_epred_draws()
df.full %>% 
  distinct(grp, cnd) %>% 
  add_epred_draws(m) %>% 
  
  # make a plot!
  ggplot() +
  geom_col(data = df.full.probability.summaries,
           aes(x = OUT, y = probability),
           fill = "skyblue") +
  stat_pointinterval(aes(x = .category, y = .epred),
                     point_size = 1.5) +
  facet_grid(grp ~ cnd)
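
(For a built-in alternative, pp_check(m, type = "bars_grouped", group = "grp") should give a similar kind of bar-based check, though it can only facet by a single grouping variable, so the hand-made version gives you more control.)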

For more on nominal models, you might go here or here.

Also, it looks like this is your first post on here. Welcome to the Stan forums, @IreneSophia!

Hello Solomon,

Thanks for your reply and the welcome! I don't intend to use the estimates for anything in particular, but understanding exactly what they are estimating always helps me feel more secure about what a model does.

I skimmed the references you linked, and I am sure working through them will help me understand categorical models better. Unfortunately, I think there are no examples with interactions, so I’m still looking for materials that go into that a bit more.

Cheers,
Irene Sophia

With the framework I’m suggesting, you would get at the interaction with probability contrasts. A nice way to do this is with help from the tidybayes::compare_levels() function. Starting slow, here’s how to use compare_levels() to compute the B - A contrast, by the two levels of cnd. Note how I’ve added in the sample contrasts within the geom_col() function.

df.full %>% 
  distinct(grp, cnd) %>% 
  add_epred_draws(m) %>% 
  group_by(.category, cnd) %>% 
  compare_levels(variable = .epred, by = grp) %>% 
  
  ggplot() +
  geom_col(data = df.full.probability.summaries %>% 
             select(-n) %>% 
             pivot_wider(names_from = grp, values_from = probability) %>% 
             mutate(`B - A` = B - A),
           aes(x = OUT, y = `B - A`),
           fill = "violetred1"
  ) +
  geom_hline(yintercept = 0, linetype = 2) +
  stat_pointinterval(aes(x = .category, y = .epred),
                     .width = .95) +
  coord_cartesian(ylim = c(-0.15, 0.15)) +
  facet_wrap(~ cnd)

Here’s the reverse contrast, cnd3 - cnd1 by the two levels of grp.

df.full %>% 
  distinct(grp, cnd) %>% 
  add_epred_draws(m) %>% 
  group_by(.category, grp) %>% 
  compare_levels(variable = .epred, by = cnd) %>% 
  
  ggplot() +
  geom_col(data = df.full.probability.summaries %>%
             select(-n) %>%
             pivot_wider(names_from = cnd, values_from = probability) %>% 
             mutate(`cnd3 - cnd1` = cnd3 - cnd1),
           aes(x = OUT, y = `cnd3 - cnd1`),
           fill = "violetred2") +
  geom_hline(yintercept = 0, linetype = 2) +
  stat_pointinterval(aes(x = .category, y = .epred),
                     .width = .95) +
  coord_cartesian(ylim = c(-0.15, 0.15)) +
  facet_wrap(~ grp)

But what you really want (I believe) is a difference in differences. With your data, you can do that in two ways. First, we compute the difference in the B - A contrast, by the two levels of cnd.

df.full %>% 
  distinct(grp, cnd) %>% 
  add_epred_draws(m) %>% 
  group_by(.category, cnd) %>% 
  compare_levels(variable = .epred, by = grp) %>% 
  group_by(.category) %>% 
  # (B - A | cnd3) - (B - A | cnd1)
  compare_levels(variable = .epred, by = cnd) %>% 
  
  ggplot() +
  geom_col(data = df.full.probability.summaries %>%
             select(-n) %>%
             pivot_wider(names_from = grp, values_from = probability) %>% 
             mutate(`B - A` = B - A) %>%
             select(cnd, OUT, `B - A`) %>% 
             pivot_wider(names_from = cnd, values_from = `B - A`) %>% 
             mutate(`cnd3 - cnd1` = cnd3 - cnd1),
           aes(x = OUT, y = `cnd3 - cnd1`),
           fill = "violetred3") +
  geom_hline(yintercept = 0, linetype = 2) +
  stat_pointinterval(aes(x = .category, y = .epred),
                     .width = .95) +
  coord_cartesian(ylim = c(-0.15, 0.15)) +
  labs(x = "OUT",
       y = "(B - A | cond3) - (B - A | cond1)")

Here’s the reverse.

df.full %>% 
  distinct(grp, cnd) %>% 
  add_epred_draws(m) %>% 
  group_by(.category, grp) %>% 
  compare_levels(variable = .epred, by = cnd) %>% 
  group_by(.category) %>% 
  # (cnd3 - cnd1 | B) - (cnd3 - cnd1 | A)
  compare_levels(variable = .epred, by = grp) %>% 
  
  ggplot() +
  geom_col(data = df.full.probability.summaries %>%
             select(-n) %>%
             pivot_wider(names_from = cnd, values_from = probability) %>% 
             mutate(`cnd3 - cnd1` = cnd3 - cnd1) %>% 
             select(grp, OUT, `cnd3 - cnd1`) %>%
             pivot_wider(names_from = grp, values_from = `cnd3 - cnd1`) %>% 
             mutate(`B - A` = B - A),
           aes(x = OUT, y = `B - A`),
           fill = "violetred3") +
  geom_hline(yintercept = 0, linetype = 2) +
  stat_pointinterval(aes(x = .category, y = .epred),
                     .width = .95) +
  coord_cartesian(ylim = c(-0.15, 0.15)) +
  labs(x = "OUT",
       y = "(cnd3 - cnd1 | B) - (cnd3 - cnd1 | A)")

The results are the same either way.
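
If you want to confirm that numerically rather than just visually, a quick summary like this should do it (it's the same pipeline as above, just summarised instead of plotted; swap the order of the two compare_levels() steps to get the other version, and the summaries should match):

df.full %>% 
  distinct(grp, cnd) %>% 
  add_epred_draws(m) %>% 
  group_by(.category, cnd) %>% 
  compare_levels(variable = .epred, by = grp) %>% 
  group_by(.category) %>% 
  compare_levels(variable = .epred, by = cnd) %>% 
  # median and 95% interval of the difference in differences,
  # by response category
  median_qi(.epred)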

Thus, you don’t have much of an interaction going on with these data.

These are very good visualisations of what is going on, thanks!

I’m still a bit unsure what exactly the estimates measure, but at this point I think I understand conceptually well enough what is going on.
