How can I combine credible intervals when there are differences in slope terms from interactions?

I would like to calculate credible intervals for the slopes of a model with continuous by categorical interactions. However, in R output, I get a measure of slope for the baseline for one factor and the differences for the remaining factors. E.g. in the below model using the iris data I would like to build a credible interval around the slope for versicolor and virginica.

I asked on the stats stackexchange if this is possible and it seems to be. “If you have posterior draws of the parameters a and b, then merely adding them together c=a+b serves as a posterior draw from the sum distribution. With these posterior draws of c, a credible interval is calculated as usual in the Bayesian setting.”

library(brms)
fitiris <- brm(Petal.Length ~ Petal.Width * Species, data = iris)
summary(fitiris)

Here’s the full output from the model to save you running it:

Population-Level Effects: 
                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept                         1.33      0.13     1.08     1.58 1.00     1707     2237
Petal.Width                       0.55      0.48    -0.40     1.50 1.00     1618     2287
Speciesversicolor                 0.45      0.39    -0.32     1.22 1.00     2107     2349
Speciesvirginica                  2.91      0.41     2.10     3.70 1.00     1798     2597
Petal.Width:Speciesversicolor     1.33      0.56     0.23     2.44 1.00     1486     1821
Petal.Width:Speciesvirginica      0.10      0.51    -0.88     1.09 1.00     1473     1775
1 Like

You can use one of the as_draws() helper functions (I like as_draws_df()) to extract the posterior draws for the parameters in your model. Then you can manually compute the slopes you desire by using the model formula. E.g.,

library(tidyverse)

as_draws_df(fitiris) %>% 
  mutate(my_versicolor_slope = b_Intercept + b_Speciesversicolor,
         my_virginica_slope = b_Intercept + b_Speciesvirginica)

Then you can use any number of convenience functions to summarize those new posteriors (e.g., mean(), quantile(), tidybayes::median_qi()).

4 Likes

@Solomon - in a similar vein to this, if I have random effects of a categorical variable, are the coefficients labelled as ‘r_whatevereffect’ in the draws data frame the deviation from the overall/average estimated effect of that variable? So for example, if I wanted to know the estimate of some specific level of a random effect, I could simply do something like?:

as_draws_df(ref_model) %>% 
 mutate(average_effect = b_Intercept + b_Overalleffect,
        specific_randomeffectv1 = b_Intercept + b_Overalleffect + r_Specificsubeffect, # random effect is deviation from overall effect
        specific_randomeffectv2 = b_Intercept + r_Specificsubeffect # random effect is self-standing estimate
)

So the key question here for judging how to add the effects is whether the r_effect components of the posterior are deviations from the average effect or rather represent a direct estimate for that sub effect, which are averaged to make the overall effect?

Regardless of whether you have a categorical predictor, “the coefficients labelled as ‘r_whatevereffect’ in the draws data frame [ARE] the deviation from the overall/average estimated effect of that variable.”

Say you have a simple intercepts-only multilevel model using the Gaussian likelihood

\begin{align*} y_{ij} & \sim \mathcal N(\mu_{ij}, \sigma_\epsilon) \\ \mu_{ij} & = \alpha + u_i \\ u_i & \sim \mathcal N(0, \sigma_u), \end{align*}

where your criterion y varies across i people and j measurement occasions. The grand mean intercept is \alpha and the person-level deviations are captured in the u_i term. When you pull your posterior samples with as_draws_df(), those u_i deviations are in the ‘r_whatevereffect’ columns.

Thanks Solomon! To avoid making mistakes when adding stuff together like this, is there a simpler way of getting some fitted estimates? I recall the function add_fitted_draws, but now that is deprecated and there are many variations of it that start to get rather confusing to follow. For example, is there a way to get the full posterior distribution just for specified levels e.g., feed it a data dataframe that indicates the Gender, Age, and specific item in the multilevel model that you want an estimate for, and it returns the fitted estimates for that combination of features without having to calculate them by adding the different posterior segments up?

Depending on what you’re looking for, spend some time looking through the ranef(), coef(), and fitted() functions. You’re probably interested in coef() and fitted().

1 Like

Thanks again, after checking these out it is possible that fitted() is the way to go. This is apparently akin to add_epred_draws().

But what is the difference between this:
add_epred_draws() adds draws from expectation of the posterior predictive distribution to the data

and this?:
add_predicted_draws() adds draws from posterior predictive distribution to the data.

Someone else will have to chime in. I don’t tend to use those functions.

1 Like

Look at the brms documentation for posterior_epred() (to which add_epred_draws() corresponds to) and the documentation for posterior_predict() (to which add_predicted_draws() corresponds to).
posterior_epred() provides the expected value with uncertainty only in the mean incorporated, where as posterior_predict() provides predictive distribution with residual error. The means should be similar but the variance higher for posterior_predict()

1 Like

Just coming back to this answer Solomon to clarify a point. I was interested in looking at the slope estimates but your answer uses b_Intercept. I presume in my example it would be rather Petal.Width + Petal.Width:Speciesversicolor and Petal.Width + Petal.Width:Speciesvirginica

Ah, good clarification point. You’re right. It muddied the waters when I included the intercept in my answers, which is the kind of thing you do when expressing an expected value. I believe your suggestions would be correct.

1 Like