I’m currently running a categorical brms model that looks as follows:
mdl <- brm(A ~ B + (1 | participant), dataset, family = categorical(), cores = 4, backend = "cmdstanr")
with A being the categorical outcome variable with 6 levels (a, b, c, d, e, f),
B being a categorical predictor variable with 2 levels (1, 2),
and participant also being a categorical variable.
The model uses a and 1 as reference levels, so my results look like this:
               Estimate  Est.Error  l-95% CI  u-95% CI  Rhat  Bulk_ESS  Tail_ESS
mub_Intercept     -0.83       0.26     -1.35     -0.33  1.00      1339      1920
muc_Intercept     -0.59       0.29     -1.18     -0.03  1.00      1028      1451
mud_Intercept      0.01       0.40     -0.79      0.81  1.01       741      1376
mue_Intercept     -0.39       0.30     -1.00      0.20  1.00       905      1564
muf_Intercept     -0.96       0.24     -1.45     -0.50  1.01       995      1845
mub_B2             0.82       0.33      0.19      1.48  1.00      1300      1668
muc_B2            -0.99       0.39     -1.78     -0.23  1.00      1164      1609
mud_B2            -1.81       0.54     -2.96     -0.78  1.01       872      1613
mue_B2            -2.12       0.42     -2.98     -1.35  1.00      1378      1630
muf_B2             0.43       0.31     -0.17      1.06  1.01      1140      1529
I have two questions about these results:
Obviously, all of the intercept results are in comparison to a1, but are the slope estimates also in comparison to a1? For example, is mub_B2 calculated in comparison to the theoretical mua_Intercept at zero or in comparison to the mub_Intercept?
Why is there no estimate for mua_B2? I understand that the intercept for a is at zero due to being the reference level, but it would still be possible to get a slope estimate describing the difference between a in 1 vs. 2, right?
I’m sorry for how basic these questions probably are. I have seen results of other similar models that look exactly like this, so I don’t think there are any issues, but the lack of mua_B2 has thrown me for a bit of a loop with regard to interpreting the results. I would appreciate any input!
Operating system: macOS Ventura 13.6.1
R Version: 4.4.2
RStudio Version: 2024.12.0+467
brms Version: 2.22.0
cmdstanr Version: 0.8.1
Categorical models like this are strange birds, and their parameters are very challenging to interpret directly. Kruschke covered them in Chapter 22 of his text, and I’ve walked that material out with a brm()-based workflow here. I recommend you back up and first fit an intercepts-only version of the model and take some time working through the meanings of the intercepts. Then scale up to a model with your predictor.
Thanks for the recommendations! I had already looked at your brms version of the chapter, but I have now also got hold of Kruschke’s book and (re-)read the relevant chapters in both. I think I’ve come to an understanding of how to interpret the estimates. I will attempt to describe it below, and if anyone feels motivated to read it and maybe confirm, that would be awesome!
The theoretical mua_Intercept is set at zero and the other _Intercept estimates represent the log odds of the other categories in outcome A occurring within level 1 of predictor B.
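One way to check this reading is to push the posted intercepts through a softmax and look at the implied probabilities at level 1 of B. This is only a rough sketch in base R: it plugs in the posterior means from the summary table, sets the participant effect to zero, and ignores posterior uncertainty. The names eta_B1, softmax and p_B1 are made up for illustration.

```r
# Posterior means copied from the summary table; the linear predictor
# for the reference category a is fixed at 0 rather than estimated.
eta_B1 <- c(a = 0, b = -0.83, c = -0.59, d = 0.01, e = -0.39, f = -0.96)

# Softmax: exponentiate each linear predictor, then normalise to sum to 1.
softmax <- function(eta) exp(eta) / sum(exp(eta))

# Implied category probabilities at B = 1 (participant effect set to 0).
p_B1 <- softmax(eta_B1)
print(round(p_B1, 3))

# Taking the log of each probability relative to a recovers the intercepts.
log_odds_vs_a <- log(p_B1 / p_B1["a"])
print(round(log_odds_vs_a, 2))
```

The last step is the useful one: each intercept comes back as log(P(category) / P(a)), so the intercepts are log odds against a specifically, not against all other categories combined.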
The theoretical mua_B2 is also set at zero. Since this is the slope estimate, the regression line for category a thus starts at zero and is completely horizontal, thus ending at zero for level 2 of predictor B as well. The other slope estimates are calculated in comparison to this zero slope. In visual terms: to calculate mub_B2, the zero slope of mua is moved to the mub_Intercept at -0.83, creating a horizontal line at that intercept. The slope mub_B2 is then described in terms of its divergence from that horizontal line, resulting in a regression line from mub_Intercept to the value of b at level 2 (about -0.01 in this case). The estimate thus describes in log odds how much more likely b is to occur in level 2 rather than 1. It does not describe (as I previously worried about) the log odds of b occurring in level 2 compared to a occurring in level 1. Does that sound about right?
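A small numerical check may help with the slopes. As far as I understand the categorical family, every linear predictor is measured against a, so mub_B2 is the change in log(P(b) / P(a)) when moving from level 1 to level 2, rather than a change in the probability of b on its own. The sketch below again uses only the posterior means from the summary table, with the participant effect set to zero; all object names are made up for illustration.

```r
softmax <- function(eta) exp(eta) / sum(exp(eta))

# Posterior means from the summary table; both a entries are fixed at 0.
intercepts <- c(a = 0, b = -0.83, c = -0.59, d =  0.01, e = -0.39, f = -0.96)
slopes_B2  <- c(a = 0, b =  0.82, c = -0.99, d = -1.81, e = -2.12, f =  0.43)

# Linear predictors at each level of B (dummy coding, level 1 as reference).
eta_B1 <- intercepts
eta_B2 <- intercepts + slopes_B2

# Implied category probabilities at each level.
p_B1 <- softmax(eta_B1)
p_B2 <- softmax(eta_B2)
print(round(rbind(B1 = p_B1, B2 = p_B2), 3))

# Change in the log odds of b versus a between the two levels.
delta_b_vs_a <- unname(log(p_B2["b"] / p_B2["a"]) - log(p_B1["b"] / p_B1["a"]))
print(delta_b_vs_a)
```

The printed difference matches mub_B2. Note also that the probability of a is not the same at the two levels even though both of its coefficients are pinned at zero, so the change in a between 1 and 2 is still recoverable from the fitted probabilities. It just isn't a separate parameter.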
I’m still not sure why the model would be calculated this way to be honest. Surely one could use a horizontal line as a reference point that is unrelated to any of the outcome categories? That way a meaningful value could be calculated for mua_B2. Regardless, I am more confident in my interpretation of the results now, so thank you!
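On why one category has to be pinned at zero: the softmax only depends on differences between the linear predictors, so adding the same constant to all six gives identical probabilities. A reference line unrelated to any category would therefore leave the coefficients unidentified. Here is a minimal base R illustration; the shift of 5 is arbitrary, and the eta values are just intercept plus slope from the summary table at level 2.

```r
softmax <- function(eta) exp(eta) / sum(exp(eta))

# Linear predictors at B = 2 (intercept + slope, posterior means).
eta <- c(a = 0, b = -0.01, c = -1.58, d = -1.80, e = -2.51, f = -0.53)

# Hypothetical alternative parameterisation: shift every category by
# the same constant, so that a is no longer at zero.
eta_shifted <- eta + 5

# Both parameterisations imply the same probabilities.
same_probs <- isTRUE(all.equal(softmax(eta), softmax(eta_shifted)))
print(same_probs)
```

Since the data cannot tell these two sets of values apart, fixing one category's coefficients at zero is a way of picking a single representative. Which category serves as the reference changes the coefficients but not the implied probabilities.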