Check Interpretation for a logit brms model


I wonder if I can pick anyone’s brains to make sure i’m interpreting something right.
In a bayesian model, I am testing whether response sentiment (categorical: positive, negative, or neutral) varies by group (1,2,3).
I am using a logit model. I have set the reference as ’sentiment: negative: and group:1

Formula: sentiment ~ 1 + group

Population-Level Effects: 							
                       	 Estimate 	Est.Error 	l-95% CI	 u-95% CI 	Rhat	 Bulk_ESS	 Tail_ESS
muneutral_Intercept	       -3.62	 0.3	    -4.23	    -3.1	     1	     59104	     59340
mupositive_Intercept	   -1.7	     0.17	    -2.01     	-1.34	     1	     62355	     63736
muneutral_group 2	        1.3	     0.24	     0.85	     1.78	     1	     109984	     101582
muneutral_group 3	        1.36	 0.24	     0.9	     1.84	     1	     110194	     103026
mupositive_group 2	        0.38	 0.13	     0.12	     0.63	     1	     138837	     110094
mupositive_group 3	        0.38	 0.13	     0.12	     0.64	     1	     138532	     113543

My current interpretation:
The estimate for muneutralgroup 2 is how likely group 2 is to be neutral, rather than negative, compared to group 1?
Similarly, the estimate for mupositive group 3 is how likely group 3 is to be positive, rather than negative, compared to group 1?

Am I right in thinking that if I wanted to see the difference in how many positive, negative and neutral scores by group I would need to do a dummy variable of the three sentiments instead?
Thanks in advance, would really value anyone’s thoughts! This model has been a pain for a long time, and i keep coming back to it.

Thank you


Are you using the categorical family in brms? If so, why not an ordinal model? It seems to me that negative, neutral, and positive, are ordered data, no? Seems like you might should be using an ordinal model.

I believe it’s the log-odds for neutral vs negative for group 2 compared to group 1. If you exponentiate it (the MCMC samples for the coefficient and then apply summary statistics), it will be the odds ratio for neutral for group 2 compared to group 1.
I think you can show yourself this with something like:


set <- c("A","B","C")
d <- sample(set, size=1000, replace=T)
d1 <-
set <- c("gr1","gr2","gr3")
g <- sample(set, size=1000, replace=T)
d1$g <- g
d1$g <- factor(d1$g)

m1 <- brm(d ~ 1 + g, family=categorical, data=d1, cores=4)
s1 <- as_draws_df(m1)

#OR from model
co <- exp(s1$b_muB_ggr2)
quantile(co, c(0.025, 0.975))

#OR from data; compare above to
B_1 <- as.numeric(table(d1$d[d1$g=="gr1"])[2])
A_B_1 <- (as.numeric(table(d1$d[d1$g=="gr1"])[1]) + as.numeric(table(d1$d[d1$g=="gr1"])[2]))
O_1 <- (B_1/A_B_1)/(1 - (B_1/A_B_1))

B_2 <- as.numeric(table(d1$d[d1$g=="gr2"])[2])
A_B_2 <- (as.numeric(table(d1$d[d1$g=="gr2"])[1]) + as.numeric(table(d1$d[d1$g=="gr2"])[2]))
O_2 <- (B_2/A_B_2)/(1 - (B_2/A_B_2))

OR <- O_2/O_1

You should be able to calculate any quantity of interest using fitted() and a new data frame with the conditions that you wish. For example, if you wanted to see the proportion of sentiment that was negative, neutral, and positive for all the groups then you could make a small dataframe with all the groups and then run fitted(my_model, newdata=my_new_data). If you wish to do some other calculations then you can add the summary=FALSE argument to return the samples instead of the summary stats, then extract the samples and use them as needed.



Thank you so much for such a comprehensive answer! Good question re. ordinal. To be honest an ordinal model was the first thing I tried a few years ago (regression - New to Bayesian stats, using ordered categorical data, how to interpret/check model? - Stack Overflow) but at the time I simply could not understand, or interpret it so I gave up in favour of something I understood slightly more. I asked here too, but I was so new to bayesian stats that I got completely wrapped up in understanding the link, before I understood anything about priors (Selecting a link function - #3 by martinmodrak) Maybe time to revisit it…I’ll have a think and give your suggestions a go as well. Watch this space… :)

argh. I remember why I gave up. I didn’t know how to interpret the output of the ordinal model because the predictor is categorical (group 1,2,3). Even if I make them ordered factors, the output is group.L and group.Q and I have no idea how to interpret! Should I Make a new Q?

edit: I need dummy variables for group! At this point I’m just word vomiting on this question


I found the following article : Fitting and Interpreting a Proportional Odds Model | University of Virginia Library Research Data Services + Sciences

To be valuable for understanding how to interpret/use the output of a cumulative probability model–I want to say that’s what you’ve fit in your first post;

logit[P(Y \leq j)] = \alpha_j + \beta_1X_{group2} + \beta_2X_{group3}

But I’m unfamiliar with brms, and I’m not 100% sure what the mu___ terms represent. That article works through an application of the above logit model, using a nominal/unordered categorical predictor. I hope it helps!


Hi @jd_c

I’ve had a really good think about this, and ran an ordinal model on the data too.
I think that my original model is right, because what I should have specified in my question here, is that I am interested in how much more likely the three different groups are to report positive/negative/neutral sentiment responses.
I am interested in whether group 1 has more/less negative responses, more positive, more neutral compared to group 2, and if group 2 has more/less negative responses, more positive, more neutral compared to group 3, and if group has 3 more/less negative responses, more positive, more neutral compared to group 1. And if I understand, an ordinal model will tell me where on the order along negative->neutral->positive responses for group 1,2 and 3 fall, rather than if there are more negative responses in a particular group. Am I right in thinking this way? In that case the log odds as you showed will be the best way for me to interpret, I think.

The code for my original model in r is: <- brm(
  formula = sentiment ~ 1 + group,
  data = my_data,
  prior = m1priors,
  family = categorical(refcat = "negative"),
  seed = 64, combine = TRUE, chains = 5, control = list(adapt_delta = 0.99),
  iter = 10000,warmup = 4000,
  cores = 5

Would it help if I put up some reproducible data? It’s really helpful to be able to talk through this, thank you for your time.

As @jd_c stated, you are correct on the interpretation except that it is the difference in log-odds. In your case, the model is trying to estimate three different equations, one for each level of the outcome. Traditionally (i.e. in frequentist settings), it is made tractable by setting the first equation to 0, which aligns with your reference category (negative).

\begin{pmatrix} Y = Negative \\ Y = Neutral \\ Y = Positive \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_n + \beta_{n2} G_2 + \beta_{n3} G_3 \\ \alpha_p + \beta_{p2} G_2 + \beta_{p3} G_3 \end{pmatrix}

The link function is the softmax function, which transforms the linear, unbounded log-odds into a proper set of probabilities that sum to 1.

softmax(X) = \frac{e^x}{\sum(e^x)}

Note that this reduces to the typical (binary) logistic model with only two categories.

\frac{e^x}{e^0 + e^x} = \frac{e^x}{1 + e^x}

Putting this together, the categorical model is below.

Y \sim Categorical(softmax\begin{pmatrix} 0 \\ \alpha_n + \beta_{n2} G_2 + \beta_{n3} G_3 \\ \alpha_p + \beta_{p2} G_2 + \beta_{p3} G_3 \end{pmatrix})


  • \alpha_n = muneutral_Intercept
  • \alpha_p = mupositive_Intercept
  • \beta_{n2} = muneutral_group 2
  • \beta_{n3} = muneutral_group 3
  • \beta_{p2} = mupositive_group 2
  • \beta_{p3} = mupositive_group 3

Per cumulative ordinal models, you can think of them as more constrained versions of the categorical models. For the categorical model, you’re estimating one equation for all but one level of the outcome. For the cumulative ordinal model, you’re estimating one single equation with separate thresholds (or intercepts) for all but one level of the outcome. So in your case, the categorical model would have 6 parameters while the ordinal model would collapse it down to 4 by assuming that \beta_{n2} = \beta_{p2} and \beta_{n3} = \beta_{p3}.

You could still compare the probability of each outcome level for all three groups using the predict/fitted commands described above. It would just be making a stronger assumption than the categorical model which you could test, as described at the end of the guide shared by @PatrickStar . That said, the ordinal model would not allow a situation where, for instance, Group 1 was high on both negative and positive, while Group 2 was high on neutral. So if you think that groups differ in their baseline probability of reporting neutral (regardless of how positive/negative they are), then the ordinal model may be too restrictive.


I agree that since you don’t want to impose constraints on the sentiment probabilities across the three groups, you should use the categorical family. And you don’t need to interpret the model coefficients directly; the probabilities P(sentiment|group) are easier to understand. Using the posterior probabilities, you can make all kinds of comparisons between the groups. For example, to answer “Does group 1 have more negative responses than group 2?” you can estimate “What’s the difference in the probability of a negative response if I survey a person from group 1 and a person from group 2?” Following the suggestion by @jd_c, you can calculate this as:

reponse_group1 <- fitted(, newdata = tibble(group = "1"), summary=FALSE)
reponse_group2 <- fitted(, newdata = tibble(group = "2"), summary=FALSE)

posterior::summarise_draws(reponse_group1 - reponse_group2)

In fact it makes three comparisons between groups 1 and 2 at once, one for each sentiment.


If you are just trying to summarize group responses to particular categories then your categorical model is doing that. @simonbrauer gives a nice summary. It depends on the goal. However, I would point out that this doesn’t change the nature of your response variable, which is ordered categories, not unordered categories. From a generative modeling approach, you can probably think of your categories of sentiment as a categorized version of a latent continuous variable setiment_real. Why? Because it’s highly doubtful that people feel in terms of three distinct categories - that’s just a convenience summary for you, because you don’t have a magical test capable of detecting sentiment_real. Paul-Christian Burkner and Matti Vuorre have a nice tutorial on ordinal models that has friendly explanations of the different types of ordinal models that one can fit in brms The assumptions about sentiment that I described above can be implemented with a cumulative probit model. @Solomon has a nice explanation of it as well Notes on the Bayesian cumulative probit | A. Solomon Kurz
As @simonbrauer points out, the cumulative model makes assumptions about the predictor having the same effect on all response categories. If you are willing to make different assumptions regarding the ordinal response variable, then you can fit a different kind of ordinal model (sequential or adjacent categories) and relax the assumption of predictors having the same effect across response categories. In the paper that I linked, the authors refer to this as “category-specific effects”.
Also, no matter which model you fit, you can still get the probabilities for each response category for each group (and any quantity of interest that you want to calculate) as I suggested in my previous response, and @desislava elaborated on, by using the fitted() function.
Hope that helps!


Great general advice. Also important to keep in mind that in this case there is a single categorical predictor: groups 1, 2, 3; we don’t know much about the groups, I assume the groups are “unordered”. This means that the ordinal model with category-specific effects (ie. if the assumption the groups have the same (relative) effect on sentiment is relaxed) would give about the same results as the categorical model, so perhaps the extra complexity may not required in this case. But maybe there are extra predictors?

1 Like

thank you for all the responses! I’m working through this slowly, and I want to check a few things:

in the example, @desislava said:
reponse_group1 <- fitted(fullunimputed, newdata = tibble(primetype = "1"), summary=FALSE)

so I have imputed and unimputed models. The unimputed are the same, but to run tests quickly and check things, so I’ll use them to test my understanding and run through your suggestions with.

Q1: How might I do fitted() on an imputed model (brm_multiple)
Q2: I’m not familiar with fitted(), is it similar to as.draws?
Q2: As @desislava guessed, my model is a little more complicated (though only slightly!) I add in standardised age (z.age), and whether or not they smoke (0,1). I’ll put some reproducible data below. When using fitted and a new dataframe, I assume I need to also include other variables from the model? If I incorporate smoking and z.age, I would guess I do so holding z.age at its mean, and smoker at its mode?

I like the suggestion to answer in probabilities, as @desislava’s code said, so I’m trying to understand exactly what is going on there.

# Create the ID column
ids <- sprintf("%04d", 1:30)

# Create the sentiment column
sentiments <- c("negative", "positive", "neutral")
sentiment_col <- sample(sentiments, 30, replace = TRUE)

# Create the group column
groups <- rep(1:3, each = 10)

# Create the smoker column
smoker_col <- sample(0:1, 30, replace = TRUE)

# Create the z.age column
z_age_col <- rnorm(30)

# Create the dataframe
df <- data.frame(ID = ids,
                 sentiment = sentiment_col,
                 group = groups,
                 smoker = smoker_col,
                 z.age = z_age_col)

# Run the model
unimputed_mod <- brm(
  formula = sentiment ~ 1 + group+z.age +smoker,
  data = df,
  family = categorical(refcat = "negative"),
  seed = 64, chains = 5, control = list(adapt_delta = 0.999),
  iter = 1000,warmup = 400,
  cores = 5
toc() #17.716 sec elapsed

# Family: categorical 
# Links: muneutral = logit; mupositive = logit 
# Formula: sentiment ~ 1 + group + z.age + smoker 
# Data: df (Number of observations: 30) 
# Draws: 5 chains, each with iter = 1000; warmup = 400; thin = 1;
# total post-warmup draws = 3000

# Population-Level Effects: 
#   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# muneutral_Intercept     -1.44      1.47    -4.42     1.40 1.00     2349     2149
# mupositive_Intercept    -1.95      1.44    -4.97     0.67 1.00     2195     1784
# muneutral_group          1.02      0.73    -0.34     2.51 1.00     1920     2012
# muneutral_z.age         -0.05      0.79    -1.57     1.56 1.00     2000     1896
# muneutral_smoker        -2.07      1.25    -4.72     0.24 1.00     1950     1672
# mupositive_group         0.82      0.65    -0.35     2.14 1.00     2015     2006
# mupositive_z.age        -0.72      0.81    -2.36     0.76 1.00     1820     1604
# mupositive_smoker        0.35      1.09    -1.69     2.55 1.00     2034     2050
# Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
# and Tail_ESS are effective sample size measures, and Rhat is the potential
# scale reduction factor on split chains (at convergence, Rhat = 1).

IF I run a model without age and smoker:


unimputed_mod1 <- brm(
  formula = sentiment ~ 1 + group,
  data = df,
  family = categorical(refcat = "negative"),
  seed = 64, chains = 5, control = list(adapt_delta = 0.999),
  iter = 1000,warmup = 400,
  cores = 5

reponse_group1 <- fitted(unimputed_mod1, newdata = tibble(group = "1"), summary=FALSE)
reponse_group2 <- fitted(unimputed_mod1, newdata = tibble(group = "2"), summary=FALSE)

posterior::summarise_draws(reponse_group1 - reponse_group2)

# variable   mean median    sd   mad      q5    q95  rhat ess_bulk ess_tail
# <chr>     <num>  <num> <num> <num>   <num>  <num> <num>    <num>    <num>
# 1 negative  0.393  0.403 0.186 0.187  0.0608  0.682  1.00    2029.    2158.
# 2 neutral  -0.497 -0.508 0.173 0.174 -0.760  -0.197  1.00    2677.    2385.
# 3 positive  0.104  0.102 0.185 0.187 -0.195   0.411  1.00    2597.    2174.

Now the model output gives log odds, is the reponse_group1 reponse_group2 also log odds, so for the example above, the summarise_draws are those numbers still log odds, and should be converted to actual odds? Or are the numbers crude numbers? so could I say 'if I survey a person from group 1 and a person from group 2, the person from group 1 is more likely to give a negative (0.393) and a positive answer (0.104) and less likely to give a neutral answer than the person from group 2?

Apologies for so many Qs and such a long thread! I want to make sure I am understanding this.

I would assume it works the same with a brm_multiple object… if you can call conditional_effects on a brm_multiple object, then fitted() should work fine… I always impute missing data ‘on the fly’ as parameters in the model, so I don’t work with multiple imputation and can’t say for sure, though.

fitted() is an alias of posterior_epred() which “posterior draws of the expected value of the posterior predictive distribution.” In other words, it provides samples of the mean of your brms model. fitted() has options so that you can call it to give summary statistics of the draws, summary=TRUE, or return all of the draws to be used however needed, summary=FALSE. It also has options to return values on the response scale (in your case, proportions), scale="response", or on the scale of the linear predictor (in your case, log-odds), scale="linear".

Yes, all predictors in the model should be included in the new data. You put the other predictors at whatever value or reference level on which you wish to make comparisons.

This depends on the scale= option that you chose for fitted(), as I described above. The default is probably scale="response", which would mean that fitted() is returning the expected (mean) predicted probabilities for each row of the new data. In the code you give, you are returning the draws. So, response_group1 has draws of mean probabilities of negative, neutral, and positive, when group is “1”, and response_group2 has the same for when group is “2”. The line posterior::summarise_draws(reponse_group1 - reponse_group2) is then providing summary statistics for the difference between the mean probabilities for group 1 and 2. When you take the difference you still have vectors of samples, so you need to summarize them to see the mean, median, etc, and that line just does it all at once.

Yes, it looks like that. More specifically, you would say that the estimated mean difference in the mean probability of answering “negative” for Group 1 vs Group 2 is 0.393.

1 Like

Thank you all so much. this makes much more sense than it did at the start!!!