# Different posterior-based ways of describing an effect size?

Storytelling with data is extremely important. I am trying to explore how flexibly a posterior can be used to present an effect size. What are the different options for showing a posterior-based effect size (point estimate + credible interval)?

Here are a few examples (the weight difference between groups A and B) that I came up with myself.

Data, model and posterior

```
library(tidyverse)
library(janitor)  # for clean_names()
library(brms)

# Simulate 100 weights per group
df <- bind_rows(
  tibble(weight = rnorm(100, 80, 5), group = "A"),
  tibble(weight = rnorm(100, 60, 5), group = "B")
)

fit <- brm(weight ~ (1 | group), df)

# Group means = population intercept + group offset.
# Note: posterior_samples() is deprecated in recent brms; as_draws_df() replaces it.
posterior <- posterior_samples(fit) %>%
  clean_names() %>%
  mutate(A = b_intercept + r_group_a_intercept,
         B = b_intercept + r_group_b_intercept) %>%
  select(A, B)
```

Option 1: difference on the natural scale (group A - group B)

`posterior %>% mutate(difference = A-B) %>% posterior_summary()`

```
           Estimate Est.Error     Q2.5    Q97.5
A          79.00722 0.5201468 77.97750 80.01733
B          60.00173 0.5233391 58.97731 61.02931
difference 19.00550 0.7388529 17.55313 20.41760
```

Option 2: difference as a ratio (group A / group B)

`posterior %>% mutate(difference = A/B) %>% posterior_summary()`

```
           Estimate  Est.Error      Q2.5     Q97.5
A          79.00722 0.52014681 77.977501 80.017334
B          60.00173 0.52333915 58.977314 61.029313
difference  1.31685 0.01441019  1.289337  1.344716
```

Option 3: difference as a percentage of group A (100 * (A - B) / A)

`posterior %>% mutate(difference = ((A-B)*100)/A) %>% posterior_summary()`

```
           Estimate Est.Error     Q2.5    Q97.5
A          79.00722 0.5201468 77.97750 80.01733
B          60.00173 0.5233391 58.97731 61.02931
difference 24.05209 0.8311302 22.44078 25.63488
```

Are these correct? What are other cool possibilities that I could use for showing the difference (effect size)?


They appear “correct” in the sense of “implemented correctly”. Whether they are “correct” for answering a specific question depends heavily on the question you are asking, so I don’t think there is a general answer. I think the difference on the natural scale is most often the easiest to interpret. It also matches the model closely: when one talks about relative (multiplicative) change, one usually wants a model whose response is constrained to be positive (otherwise relative difference is a weird measure), so I would expect Options 2 and 3 primarily with something like a lognormal or Poisson model. But I can imagine scenarios where either Option 2 or Option 3 is most useful even with a normal model.
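To illustrate why a ratio can be a weird measure when the response is not constrained to be positive, here is a minimal sketch in base R. The draws are simulated stand-ins, not output from the model above, and `ratio_interval` is a hypothetical helper introduced only for this example.

```r
set.seed(42)
n_draws <- 4000

# Hypothetical helper: central 95% interval of the ratio of two sets of draws
ratio_interval <- function(a, b) {
  unname(quantile(a / b, probs = c(0.025, 0.975)))
}

# Case 1: both group means are far from zero (similar to the weights above)
a_pos <- rnorm(n_draws, mean = 79, sd = 0.5)
b_pos <- rnorm(n_draws, mean = 60, sd = 0.5)
stable <- ratio_interval(a_pos, b_pos)

# Case 2: the denominator's posterior straddles zero
a_zero <- rnorm(n_draws, mean = 2, sd = 1)
b_zero <- rnorm(n_draws, mean = 0.5, sd = 1)
unstable <- ratio_interval(a_zero, b_zero)

print(stable)    # narrow interval around 79 / 60
print(unstable)  # very wide interval that spans both signs
```

In the second case the ratio is dominated by draws where the denominator is close to zero, so the interval says little about the groups themselves. With weights, which sit far from zero, the ratio behaves well even under a normal model.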

I’ll note that here you seem to be replicating the functionality of `posterior_epred`. That’s a good exercise to make sure you understand the model; I just wanted to highlight that this functionality is already available in `brms` out of the box.
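As a rough sketch of that route: `posterior_epred` returns a draws-by-rows matrix for the rows of `newdata`, which can then be summarised column by column. The helper `summarise_contrast` below is hypothetical (it is not part of `brms`), and the demo matrix is a simulated stand-in so the example runs without fitting anything; the commented lines show how I would expect it to be used with the `fit` object from the question.

```r
# Hypothetical helper: summarise each column of a draws-by-groups matrix
# (the shape posterior_epred() returns) plus the A - B difference.
summarise_contrast <- function(epred) {
  draws <- cbind(epred, difference = epred[, "A"] - epred[, "B"])
  t(apply(draws, 2, function(d) {
    c(Estimate  = mean(d),
      Est.Error = sd(d),
      Q2.5      = unname(quantile(d, 0.025)),
      Q97.5     = unname(quantile(d, 0.975)))
  }))
}

# With the fitted model from the question (needs brms and `fit`):
# newdata <- data.frame(group = c("A", "B"))
# epred <- brms::posterior_epred(fit, newdata = newdata)
# colnames(epred) <- newdata$group
# summarise_contrast(epred)

# Simulated stand-in so the sketch runs without Stan
set.seed(1)
epred_demo <- cbind(A = rnorm(4000, 79, 0.5), B = rnorm(4000, 60, 0.5))
result <- summarise_contrast(epred_demo)
print(round(result, 2))
```

Working from the `posterior_epred` matrix avoids having to know the internal parameter names (`b_Intercept`, `r_group[A,Intercept]`, ...), which matters more as the model formula grows.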

In more complex models, you often get a lot of valid prediction tasks for the same model (e.g. which covariates to keep fixed and which to change, which random effects to include, …), and the interesting part is thinking about which prediction task (and thus which effect size) is the most relevant to your question.
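A small sketch of how the choice of random effects changes the answer, even for a simple varying-intercept model. All draws below are invented stand-ins (not real output from the model in the question), chosen only to show that the three prediction tasks give quite different intervals.

```r
set.seed(7)
n_draws <- 4000

# Invented stand-in draws for a varying-intercept model (not real output)
b_intercept <- rnorm(n_draws, mean = 70, sd = 5)           # population-level intercept
r_group_a   <- 79 - b_intercept + rnorm(n_draws, 0, 0.5)   # offset for observed group A
sd_group    <- abs(rnorm(n_draws, mean = 14, sd = 4))      # between-group SD

# Task 1: mean of the already observed group A
existing_a <- b_intercept + r_group_a
# Task 2: population-level mean only (what re_formula = NA targets in brms)
population <- b_intercept
# Task 3: mean of a new, unobserved group drawn from the group distribution
new_group <- b_intercept + rnorm(n_draws, mean = 0, sd = sd_group)

tasks <- cbind(existing_a, population, new_group)
task_summary <- t(apply(tasks, 2, quantile, probs = c(0.025, 0.5, 0.975)))
print(round(task_summary, 1))
```

The same model gives a tight interval for an observed group and a much wider one for a new group, so an effect size built on one task can look very different from one built on another.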