Using emmeans to get marginal means from a multinomial logistic regression in brms

I regularly use emmeans to calculate custom contrasts scross a wide range of statistical models. One of its strengths is its versatility: it is compatible with a huge range of packages. I have recently discovered that emmeans is compatible with the brms package, but am having trouble getting it to work. I will conduct an example multinomial logistic regression analysis use a dataset provided here. I will also conduct the same analysis in another package (nnet) to demonstrate what I need.


# read in data
ml <- read.dta("")

The data set contains variables on 200 students. The outcome variable is prog, program type, a three-level categorical variable (general, academic, vocation). The predictor variable is social economic status, ses, a three-level categorical variable. Now to conduct the analysis via the nnet package nnet

# first relevel so 'academic' is the reference level
ml$prog2 <- relevel(ml$prog, ref = "academic")

# run test in nnet
test_nnet <- multinom(prog2 ~ ses, 
                      data = ml)

Now run the same test in brms

# run test in brms (note: will take 30 - 60 seconds)
test_brm <- brm(prog2 ~ ses,
                data = ml,
                family = "categorical")

I will not print the output of the two models but the coefficients are roughly equivalent in both

Now to create an emmeans object that will allow us to conduct pariwise tests

# pass into emmeans
rg_nnet <- ref_grid(test_nnet)
em_nnet <- emmeans(rg_nnet,
                   specs = ~prog2|ses)

# regrid to get coefficients as logit
em_nnet_logit <- regrid(em_nnet,
                        transform = "logit")


# output
# ses = low:
#   prog2      prob    SE df lower.CL upper.CL
# academic -0.388 0.297  6   -1.115   0.3395
# general  -0.661 0.308  6   -1.415   0.0918
# vocation -1.070 0.335  6   -1.889  -0.2519
# ses = middle:
#   prog2      prob    SE df lower.CL upper.CL
# academic -0.148 0.206  6   -0.651   0.3558
# general  -1.322 0.252  6   -1.938  -0.7060
# vocation -0.725 0.219  6   -1.260  -0.1895
# ses = high:
#   prog2      prob    SE df lower.CL upper.CL
# academic  0.965 0.294  6    0.246   1.6839
# general  -1.695 0.363  6   -2.582  -0.8072
# vocation -1.986 0.403  6   -2.972  -0.9997
# Results are given on the logit (not the response) scale. 
# Confidence level used: 0.95 

So now we have our lovely emmeans() object that we can use to perform a vast array of different comparisons.

However, when I try to do the same thing with the brms object, I don’t even get past the first step of converting the brms object into a reference grid before I get an error message

# do the same for brm
rg_brm <- ref_grid(test_brm)

Error : The select parameter is not predicted by a linear formula. Use the 'dpar' and 'nlpar' arguments to select the parameter for which marginal means should be computed.
Predicted distributional parameters are: 'mugeneral', 'muvocation'
Predicted non-linear parameters are: ''
Error in ref_grid(test_brm) : 
  Perhaps a 'data' or 'params' argument is needed

Obviously, and unsurprisingly, there are some steps I am not aware of to get the Bayesian software to play nicely with emmeans. Clearly there are some extra parameters I need to specify at some stage of the process but I’m not sure if these need to be specified in brms or in emmeans. I’ve searched around the web but am having trouble finding a simple but thorough guide.

Can anyone who knows how, help me to get the brms model into an emmeans object?

Have you tried passing the names of your distributional parameters to the mult.levs or mult.names arguments in the ref_grid function?

You can also find how brms built-in support for emmeans works here: Support Functions for emmeans — emmeans-brms-helpers • brms which may give you an idea of how things work behind the scenes.

As the error suggests, you likely just need to specify the parameter(s) that you wish to estimate marginal means for explicitly.

Thanks for commenting @ajnafa. I tried rg_brm <- ref_grid(test_brm, dpar = "mugeneral") and that and got that at least to run but, after passing it through emmeans(rg_brm, specs = ~ses) it threw the warning message: In model.matrix.default(terms, data, ...) : variable 'prog2' is absent, its contrast will be ignored yielding coefs for low , middle , and high of -0.171, -0.786, -1.56. But I don’t actually know what those coefficients represent. I guessed Log-odds of general vs academic (reference) at each level of SES maybe but wasn’t sure.

After putting the same question up on Stack Overflow ( Russ Lenth, the creator of emmeans replied “If it’s like multi or, it is log prob[i] - log prob]1] for i=1,2,…k, k being the number of multi nominal responses.” So I still don’t really know what those coefficients are. In the end I suspect it might be better to go the long way round and just work directly with the mcmc chains. I was hoping to find a way to streamline that process using emmeans but it may not be possible with multinomial models.

1 Like