Post-hoc test for multinomial logistic regression brm model (categorical responce)

I apologise as I am very new to this package and I really appreciate any help I can get.

I have a brms model with a categorical response variable (Species) with the following formula, running a multinomial logistics regression (I believe). Species ~ Density_1 + Density_2 + Canopy_Height + Soil_texture + pH + (1 | Fragment).

All the explanatory variables are continuous and represent different habitat characteristics for six different plant species. I want to see which variables are correlated with the presence of which species. I need now to conduct some post hoc analysis for this model since the first species was used as the intercept and I am unable to see the effect of the habitat variables on this first species. I have tried using emmeans and bayestestR but neither seem compatible with the categorical brmsfit model object. Any suggestions for what I can do ?

Below is my model code:

model1_07<- brm(Species ~ 
                   Density_1 +
                   Density_2 +
                   Canopy_Height +
                   Soil_texture +
                   pH
                 + (1 | Fragment)
                 , data = df_scaled_use, 
                 family = categorical(), 
                 iter = 10000, #may need to have upwards of 10000
                 # burn =
                 thin=1,
                 save_pars = save_pars(all = TRUE))

Here is my model output if that is helpful:

 Family: categorical 
  Links: muCprestonianus = logit; muCpsammophilus = logit; muCsaintelucei = logit; muDbrevicaulis = logit; muDscottiana = logit 
Formula: Species ~ Density_1 + Density_2 + Canopy_Height + Soil_texture + pH + (1 | Fragment) 
   Data: df_scaled_use (Number of observations: 120) 
  Draws: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Fragment (Number of levels: 5) 
                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muCprestonianus_Intercept)     4.83      2.50     1.70    11.16 1.00    10823    11896
sd(muCpsammophilus_Intercept)     1.17      1.05     0.04     3.88 1.00     7864     9312
sd(muCsaintelucei_Intercept)      1.01      0.88     0.04     3.30 1.00     6633     5801
sd(muDbrevicaulis_Intercept)      4.80      2.91     1.61    12.87 1.00     2317      864
sd(muDscottiana_Intercept)        1.46      0.95     0.13     3.78 1.00     6741     6506

Regression Coefficients:
                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muCprestonianus_Intercept        -1.24      1.91    -5.34     2.26 1.00    12108    11102
muCpsammophilus_Intercept        -0.25      0.93    -2.22     1.44 1.00     8064     8688
muCsaintelucei_Intercept         -0.17      0.85    -1.99     1.45 1.00     8935     5006
muDbrevicaulis_Intercept         -2.78      2.36    -8.33     1.06 1.00    10170     8498
muDscottiana_Intercept            0.16      0.94    -1.70     2.05 1.00    11847    12408
muCprestonianus_Density_1        -1.75      1.10    -4.12     0.22 1.00     4875     1057
muCprestonianus_Density_2         0.72      0.85    -0.76     2.58 1.00    11069    12796
muCprestonianus_Canopy_Height    -0.45      0.57    -1.55     0.71 1.00    15702    12974
muCprestonianus_Soil_texture     -0.64      0.75    -2.16     0.78 1.00     6655     4310
muCprestonianus_pH                1.95      0.76     0.55     3.57 1.00    12281    11766
muCpsammophilus_Density_1        -1.91      0.77    -3.50    -0.46 1.00    10998    12544
muCpsammophilus_Density_2        -2.69      0.81    -4.33    -1.21 1.00    12036    14095
muCpsammophilus_Canopy_Height    -1.29      0.58    -2.47    -0.22 1.00    12979    14650
muCpsammophilus_Soil_texture      0.24      0.57    -0.88     1.38 1.00     9225    11769
muCpsammophilus_pH                0.28      0.66    -1.01     1.61 1.00    11301    11847
muCsaintelucei_Density_1          0.92      0.60    -0.22     2.13 1.00    11401    12351
muCsaintelucei_Density_2         -1.93      0.64    -3.27    -0.76 1.00    10004    11519
muCsaintelucei_Canopy_Height     -1.03      0.55    -2.14     0.03 1.00    12773    14404
muCsaintelucei_Soil_texture      -1.77      0.61    -3.06    -0.65 1.00    11457    12137
muCsaintelucei_pH                 0.37      0.59    -0.78     1.54 1.00    13250    13415
muDbrevicaulis_Density_1         -1.60      0.74    -3.14    -0.22 1.00    10458    11812
muDbrevicaulis_Density_2         -1.89      0.74    -3.42    -0.54 1.00    11121    11950
muDbrevicaulis_Canopy_Height     -0.86      0.57    -1.97     0.26 1.00    13287    14843
muDbrevicaulis_Soil_texture       1.30      0.64     0.09     2.61 1.00    10737    12441
muDbrevicaulis_pH                 1.68      0.71     0.35     3.16 1.00    11590    10872
muDscottiana_Density_1           -0.85      0.70    -2.28     0.48 1.00     8326    12671
muDscottiana_Density_2           -3.00      0.79    -4.67    -1.58 1.00    10508     8612
muDscottiana_Canopy_Height       -2.72      0.70    -4.16    -1.41 1.00    12085    13191
muDscottiana_Soil_texture         0.57      0.60    -0.58     1.76 1.00     9218    11485
muDscottiana_pH                   1.31      0.64     0.07     2.59 1.00     9169     5931

Many many thanks!

I have tried emmeans and bayestestR for doing a post hoc test. I have also been using conditional_effects() to get the effects of my predictors.

Here is simulated data that works in my model. I scaled my original data so this is also scaled.

sim_data <- data.frame(
  Species = sample(c("D. scottiana", "C. psammophilus", "D. brevicaulis", "C. saintelucei", "C. prestonianus", "B. madagascariensis"), 100, replace = TRUE),
  Fragment = sample(paste0("S", 6:10), 100, replace = TRUE),
  Density_1 = rnorm(100, mean = 0, sd = 1),
  Density_2 = rnorm(100, mean = 0, sd = 1),
  Canopy_Height = rnorm(100, mean = 0, sd = 1),
  Soil_texture = rnorm(100, mean = 0, sd = 1),
  pH = rnorm(100, mean = 0, sd = 1)
)

Using the marginaleffects package may be an option, I ran a version of your model (code below) and the avg_slopes() function does output estimates for every category:

library(marginaleffects)

avg_slopes(model_fit, variables = "Density_1")


               Group      Term    Contrast Estimate   2.5 % 97.5 %
 B. madagascariensis Density_1 mean(dY/dX)   0.0275 -0.0247 0.0862
 C. prestonianus     Density_1 mean(dY/dX)   0.0110 -0.0539 0.0757
 C. psammophilus     Density_1 mean(dY/dX)   0.0423 -0.0262 0.1140
 C. saintelucei      Density_1 mean(dY/dX)  -0.0521 -0.1308 0.0224
 D. brevicaulis      Density_1 mean(dY/dX)  -0.0333 -0.1272 0.0618
 D. scottiana        Density_1 mean(dY/dX)   0.0046 -0.0715 0.0807

Now you may run into the slight issue that your effects are non-linear, as there are many categories that you are modelling changes in and they always have to add up to a probability of 1 (I think). I believe what avg_slopes() is doing here is just calculating the average change in the back-transformed probability of each category for each unit increase of Density_1. This may not be exactly what you want.

Here is the code I used to fit the model to sim_data that I fit. I used non-default priors so that I could use less iterations and still get the model to converge. I have no idea if these priors are sensible or not, I just wanted the model to converge in the first run.

model_fit <- brm(Species ~ 
                  Density_1 +
                  Density_2 +
                  Canopy_Height +
                  Soil_texture +
                  pH
                + (1 | Fragment), 
                data = sim_data, 
                family = categorical(), 
                prior = c(
                  prior(normal(0, 1.5), class = "Intercept", dpar = "muCprestonianus"),
                  prior(normal(0, 1.5), class = "Intercept", dpar = "muCpsammophilus"),
                  prior(normal(0, 1.5), class = "Intercept", dpar = "muCsaintelucei"),
                  prior(normal(0, 1.5), class = "Intercept", dpar = "muDbrevicaulis"),
                  prior(normal(0, 1.5), class = "Intercept", dpar = "muDscottiana"),
                  prior(normal(0, 1), class = "b", dpar = "muCprestonianus"),
                  prior(normal(0, 1), class = "b", dpar = "muCpsammophilus"),
                  prior(normal(0, 1), class = "b", dpar = "muCsaintelucei"),
                  prior(normal(0, 1), class = "b", dpar = "muDbrevicaulis"),
                  prior(normal(0, 1), class = "b", dpar = "muDscottiana"),
                  prior(exponential(1), class = "sd", dpar = "muCprestonianus"),
                  prior(exponential(1), class = "sd", dpar = "muCpsammophilus"),
                  prior(exponential(1), class = "sd", dpar = "muCsaintelucei"),
                  prior(exponential(1), class = "sd", dpar = "muDbrevicaulis"),
                  prior(exponential(1), class = "sd", dpar = "muDscottiana")
                ),
                iter = 5000,
                warmup = 3000,
                save_pars = save_pars(all = TRUE),
                backend = "cmdstanr")

To make the output of avg_slopes() more interpretable, here is the summary of my model and the visualized effect of Density_1 using plot(conditional_effects(model_fit, effects = "Density_1", categorical = TRUE)).

summary(model_fit)
 Family: categorical 
  Links: muCprestonianus = logit; muCpsammophilus = logit; muCsaintelucei = logit; muDbrevicaulis = logit; muDscottiana = logit 
Formula: Species ~ Density_1 + Density_2 + Canopy_Height + Soil_texture + pH + (1 | Fragment) 
   Data: sim_data (Number of observations: 100) 
  Draws: 4 chains, each with iter = 5000; warmup = 3000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Fragment (Number of levels: 5) 
                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muCprestonianus_Intercept)     0.48      0.46     0.01     1.68 1.00     3662     3572
sd(muCpsammophilus_Intercept)     0.44      0.40     0.01     1.45 1.00     4054     3964
sd(muCsaintelucei_Intercept)      0.46      0.40     0.02     1.49 1.00     3263     3620
sd(muDbrevicaulis_Intercept)      0.36      0.33     0.01     1.20 1.00     3779     4147
sd(muDscottiana_Intercept)        0.47      0.42     0.01     1.51 1.00     3895     4091

Regression Coefficients:
                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muCprestonianus_Intercept        -0.03      0.49    -1.03     0.93 1.00     5346     5665
muCpsammophilus_Intercept        -0.13      0.51    -1.17     0.83 1.00     5727     5813
muCsaintelucei_Intercept          0.15      0.51    -0.87     1.10 1.00     4565     2999
muDbrevicaulis_Intercept          0.84      0.42    -0.00     1.65 1.00     5019     5026
muDscottiana_Intercept            0.41      0.46    -0.50     1.33 1.00     5107     5259
muCprestonianus_Density_1        -0.15      0.41    -0.96     0.65 1.00     6780     5919
muCprestonianus_Density_2         0.12      0.38    -0.61     0.88 1.00     5936     5732
muCprestonianus_Canopy_Height     0.44      0.40    -0.33     1.22 1.00     6564     6229
muCprestonianus_Soil_texture      0.15      0.37    -0.58     0.89 1.00     6631     5778
muCprestonianus_pH                0.30      0.39    -0.47     1.07 1.00     6977     6557
muCpsammophilus_Density_1         0.13      0.43    -0.69     0.97 1.00     6748     5961
muCpsammophilus_Density_2        -0.06      0.39    -0.84     0.71 1.00     6154     5681
muCpsammophilus_Canopy_Height     0.34      0.39    -0.43     1.09 1.00     6978     6580
muCpsammophilus_Soil_texture      0.50      0.38    -0.23     1.24 1.00     6627     6022
muCpsammophilus_pH               -0.52      0.40    -1.32     0.25 1.00     6300     6309
muCsaintelucei_Density_1         -0.66      0.40    -1.47     0.12 1.00     6556     6042
muCsaintelucei_Density_2         -0.53      0.36    -1.25     0.17 1.00     6099     5741
muCsaintelucei_Canopy_Height     -0.11      0.39    -0.89     0.63 1.00     6174     6283
muCsaintelucei_Soil_texture       0.71      0.35     0.04     1.40 1.00     6205     6029
muCsaintelucei_pH                -0.10      0.36    -0.79     0.63 1.00     6561     6267
muDbrevicaulis_Density_1         -0.42      0.36    -1.12     0.29 1.00     6191     5685
muDbrevicaulis_Density_2         -0.56      0.33    -1.20     0.06 1.00     5723     5968
muDbrevicaulis_Canopy_Height      0.19      0.34    -0.49     0.88 1.00     6102     5835
muDbrevicaulis_Soil_texture       0.45      0.31    -0.17     1.06 1.00     5840     6113
muDbrevicaulis_pH                -0.26      0.33    -0.90     0.40 1.00     5949     5771
muDscottiana_Density_1           -0.22      0.38    -0.97     0.54 1.00     6621     5543
muDscottiana_Density_2            0.28      0.35    -0.39     0.99 1.00     5659     5705
muDscottiana_Canopy_Height        0.60      0.37    -0.12     1.32 1.00     6543     6244
muDscottiana_Soil_texture         0.06      0.33    -0.62     0.70 1.00     5963     6264
muDscottiana_pH                  -0.09      0.36    -0.78     0.60 1.00     6199     5565

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

And here is the visualized effect of Density_1:

Hi ! Thank you so much for your quick reply!

I have tried avg_slopes on my model and it is giving me results. I am not quite sure what you meant by this though? I am not sure that the effects are as unlinear as you suspect.

Also below is the real visualised effect of Density_1 with the data. I think the sim_data I gave you isnt simulating the data very well, sorry about that.

plot(conditional_effects(model1_07, effects = "Density_1",
                         categorical = TRUE, 
                         re_formula = NULL))

And here is the output of avg_slopes(model1_07, variables = "Density_1"):


Group      Term    Contrast Estimate   2.5 %  97.5 %
 B. madagascariensis Density_1 mean(dY/dX)   0.0405 -0.0184  0.1055
 C. prestonianus     Density_1 mean(dY/dX)  -0.0381 -0.1170  0.0220
 C. psammophilus     Density_1 mean(dY/dX)  -0.1105 -0.2034 -0.0204
 C. saintelucei      Density_1 mean(dY/dX)   0.1357  0.0647  0.2111
 D. brevicaulis      Density_1 mean(dY/dX)  -0.0476 -0.1151  0.0207
 D. scottiana        Density_1 mean(dY/dX)   0.0217 -0.0500  0.0922

Columns: term, group, contrast, estimate, conf.low, conf.high, predicted_lo, predicted_hi, predicted, tmp_idx 
Type:  response

I really appreciate any insight into this! Could I be at risk of simplifying the analysis too much by using avg_slopes in this way ?

Now that I’m rereading this documentation on slopes in the marginaleffects package, I wasn’t very accurate with my description of the avg_slopes() function. Which is really the average of all the observation-specific slopes based on your data. I’d recommend reading the documentation I linked and deciding how best to present your analysis based on your data.

I’m still in the process of understanding the differences between the different options that marginaleffects can calculate for you, but maybe someone else who understands these things better can help you out.

It’s not a paper that uses brms, but this tutorial might give you some useful things to consider when fitting multinomial models: