Get the difference between two conditions in the raw scale from a model with a lognormal likelihood

Hi,
I’m trying to figure out the way to get the posterior of the difference between two trials in milliseconds from a model that was fit using a lognormal likelihood. I can extract the draws and do it “manually” but I was wondering if there is “brms” way that I’m missing.

This is my data:

df_spacebar <- structure(list(rt = c(141L, 138L, 128L, 132L, 126L, 134L, 163L, 
149L, 133L, 110L, 111L, 145L, 173L, 165L, 141L, 132L, 125L, 134L, 
157L, 166L, 163L, 165L, 165L, 165L, 173L, 165L, 173L, 158L, 140L, 
139L, 142L, 140L, 133L, 143L, 140L, 136L, 134L, 146L, 165L, 157L, 
156L, 149L, 124L, 141L, 141L, 165L, 157L, 150L, 140L, 157L, 164L, 
166L, 162L, 173L, 182L, 190L, 187L, 172L, 164L, 164L, 142L, 164L, 
141L, 148L, 149L, 140L, 157L, 158L, 139L, 146L, 148L, 149L, 151L, 
165L, 148L, 152L, 150L, 167L, 156L, 133L, 161L, 152L, 165L, 165L, 
179L, 181L, 189L, 189L, 205L, 199L, 170L, 173L, 189L, 213L, 198L, 
179L, 180L, 206L, 195L, 188L, 180L, 188L, 214L, 178L, 168L, 157L, 
157L, 157L, 147L, 165L, 157L, 147L, 158L, 181L, 189L, 198L, 180L, 
189L, 196L, 194L, 172L, 157L, 141L, 137L, 156L, 148L, 152L, 145L, 
156L, 148L, 157L, 181L, 189L, 173L, 164L, 172L, 149L, 172L, 165L, 
149L, 173L, 172L, 148L, 157L, 150L, 147L, 165L, 156L, 165L, 165L, 
172L, 165L, 168L, 181L, 164L, 156L, 164L, 157L, 142L, 185L, 172L, 
165L, 165L, 149L, 126L, 156L, 188L, 189L, 181L, 172L, 168L, 163L, 
159L, 153L, 156L, 149L, 172L, 156L, 165L, 149L, 164L, 173L, 179L, 
188L, 183L, 161L, 157L, 146L, 143L, 154L, 156L, 149L, 139L, 157L, 
174L, 157L, 158L, 166L, 184L, 197L, 164L, 188L, 156L, 164L, 165L, 
175L, 170L, 154L, 173L, 188L, 141L, 157L, 173L, 192L, 188L, 181L, 
197L, 173L, 190L, 204L, 212L, 221L, 181L, 181L, 156L, 180L, 182L, 
187L, 204L, 173L, 175L, 177L, 180L, 284L, 117L, 148L, 157L, 140L, 
164L, 172L, 197L, 166L, 163L, 166L, 165L, 139L, 141L, 163L, 165L, 
173L, 166L, 180L, 147L, 157L, 181L, 189L, 204L, 184L, 168L, 189L, 
183L, 165L, 165L, 189L, 206L, 201L, 212L, 189L, 173L, 168L, 206L, 
195L, 197L, 181L, 192L, 173L, 181L, 182L, 148L, 155L, 141L, 173L, 
204L, 181L, 213L, 205L, 181L, 174L, 174L, 153L, 164L, 204L, 165L, 
175L, 185L, 171L, 173L, 167L, 168L, 172L, 173L, 188L, 204L, 192L, 
181L, 180L, 173L, 149L, 175L, 409L, 133L, 173L, 205L, 260L, 173L, 
182L, 172L, 166L, 154L, 174L, 179L, 166L, 188L, 247L, 153L, 149L, 
157L, 180L, 167L, 192L, 172L, 173L, 180L, 180L, 172L, 172L, 158L, 
155L, 165L, 157L, 165L, 165L, 177L, 188L, 197L, 182L, 188L, 161L, 
150L, 146L, 156L, 181L, 191L, 179L, 186L, 214L, 182L, 179L, 177L, 
183L, 162L), trial = 1:361, c_trial = c(-180, -179, -178, -177, 
-176, -175, -174, -173, -172, -171, -170, -169, -168, -167, -166, 
-165, -164, -163, -162, -161, -160, -159, -158, -157, -156, -155, 
-154, -153, -152, -151, -150, -149, -148, -147, -146, -145, -144, 
-143, -142, -141, -140, -139, -138, -137, -136, -135, -134, -133, 
-132, -131, -130, -129, -128, -127, -126, -125, -124, -123, -122, 
-121, -120, -119, -118, -117, -116, -115, -114, -113, -112, -111, 
-110, -109, -108, -107, -106, -105, -104, -103, -102, -101, -100, 
-99, -98, -97, -96, -95, -94, -93, -92, -91, -90, -89, -88, -87, 
-86, -85, -84, -83, -82, -81, -80, -79, -78, -77, -76, -75, -74, 
-73, -72, -71, -70, -69, -68, -67, -66, -65, -64, -63, -62, -61, 
-60, -59, -58, -57, -56, -55, -54, -53, -52, -51, -50, -49, -48, 
-47, -46, -45, -44, -43, -42, -41, -40, -39, -38, -37, -36, -35, 
-34, -33, -32, -31, -30, -29, -28, -27, -26, -25, -24, -23, -22, 
-21, -20, -19, -18, -17, -16, -15, -14, -13, -12, -11, -10, -9, 
-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 
42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 
58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 
74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 
90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 
118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 
131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 
144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 
157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 
170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180)), row.names = c(NA, 
-361L), class = c("tbl_df", "tbl", "data.frame"))

I fit this model:

fit_press_trial <- brm(rt ~ 1 + c_trial,
  data = df_spacebar,
  family = lognormal(),
  prior = c(
    prior(normal(6, 1.5), class = Intercept),
    prior(normal(0, 1), class = sigma),
    prior(normal(0, .01), class = b, coef = c_trial)
  )
)

And I do the following to get the effect in ms in the middle of the experiment (mean trial vs. mean trial - 1)

alpha_samples <- as_draws_df(fit_press_trial)$b_Intercept
beta_samples <- as_draws_df(fit_press_trial)$b_c_trial
effect_middle_ms <- exp(alpha_samples) -
  exp(alpha_samples - 1 * beta_samples)
c(
  mean = mean(effect_middle_ms),
  quantile(effect_middle_ms, c(0.025, 0.975))
)

And I do this to get the difference between the second and the first trial:

first_trial <- min(df_spacebar$c_trial)
second_trial <- min(df_spacebar$c_trial) + 1
effect_beginning_ms <-
  exp(alpha_samples + second_trial * beta_samples) -
  exp(alpha_samples + first_trial * beta_samples)
## ms effect from first to second trial:
c(
  mean = mean(effect_beginning_ms),
  quantile(effect_beginning_ms, c(0.025, 0.975))
)

I’ve been playing with conditional_effects() but it doesn’t seem to be the right way.

1 Like

Hi Bruno,

I went to use the method I would normally use for doing this. I’m not 100% sure that I’m not making an error but figured it would still be useful for me to write this down. If this is not the same thing then maybe you or someone else can tell me.

I use fitted() and give the model the values of c_trial that I want it to predict for, and then pass the “new data” and the model fit to the predict function.

newdata <- data.frame(c_trial = c(0, 1))

fitted(fit_press_trial,
        newdata = newdata)

Your code gives me the following output:

c(
  mean = mean(effect_middle_ms),
  quantile(effect_middle_ms, c(0.025, 0.975))
)
mean       2.5%      97.5% 
0.08734442 0.06649929 0.10787354 

Mine gives me the following:

fitted(fit_press_trial,
        newdata = newdata)

 Estimate Est.Error     Q2.5    Q97.5
[1,] 168.3658  1.104144 166.2274 170.5635
[2,] 168.4538  1.104480 166.3137 170.6566

The difference between the two predictions is 0.088, so I think I’m extracting the same information. I haven’t figured out how to get the difference between the two rows, but I’m sure that’s not something that is too difficult.

Hope this helps,
Scott

Thanks!
But I do want to get the Mean and 95% CI of the difference between trials and not just the mean and 95 CI of the trials.

You can still get that with the fitted() method. Just set summary = FALSE and you’ll get the full number of posterior draws, from which you can then make a difference score.

newdata <- data.frame(c_trial = c(0, 1))

fitted(fit_press_trial,
       newdata = newdata,
       summary = FALSE)
3 Likes

Thanks!

For completeness, this way I would get the same results as with my first chunk of code showing output:

newdata <- data.frame(c_trial = c(0, 1))
middle <- fitted(fit_press_trial,
                 newdata = newdata,
                 summary = FALSE)
middle_effect_ms_2 <- middle[,2] - middle[,1]
c(
  mean = mean(effect_middle_ms_2),
  quantile(effect_middle_ms_2, c(0.025, 0.975))
)

It’s not much shorter, but it’s a nice alternative.

3 Likes

Incidentally a somewhat related conversation over here:

led to creation of an early draft brmsmargins package on github. Once installed, you could get what you want using:

library(brmsmargins)

## AME for two specific trials (0, 1)
marg <- brmsmargins(
  fit_press_trial,
  at = data.frame(c_trial = c(0, 1)),
  contrasts = cbind(
    "Estimate c_trial = 0" = c(1, 0),
    "Estimate c_trial = 1" = c(0, 1),
    "Estimate c_trial 1 - 0" = c(-1, 1)),
  CI = 0.95, CIType = "HDI")
marg$ContrastSummary

##              M          Mdn           LL          UL PercentROPE PercentMID
##1: 168.26971114 168.26542263 166.14610932 170.4999232          NA         NA
##2: 168.35775319 168.35455232 166.15603648 170.5090509          NA         NA
##3:   0.08804205   0.08793096   0.06797247   0.1081785          NA         NA
##     CI CIType ROPE  MID                  Label
##1: 0.95    HDI <NA> <NA>   Estimate c_trial = 0
##2: 0.95    HDI <NA> <NA>   Estimate c_trial = 1
##3: 0.95    HDI <NA> <NA> Estimate c_trial 1 - 0

This gives you the contrast, on the original metric between c_trial 0 and 1.
Or if you want more of an average marginal effect of the slope, you can use numerical derivatives as below. In this instance, they are fairly trivially different, but may be of use.

## AME for the numerical derivative
marg2 <- brmsmargins(
  fit_press_trial,
  add = data.frame(c_trial = c(0, 1e-5)),
  contrasts = cbind(
    "Estimate AME for slope of c_trial " = c(-1/1e-5, 1/1e-5)),
  CI = 0.95, CIType = "HDI")

marg2$ContrastSummary
##            M        Mdn         LL        UL PercentROPE PercentMID   CI
##1: 0.08815504 0.08803763 0.06801887 0.1083883          NA         NA 0.95
##   CIType ROPE  MID                              Label
##1:    HDI <NA> <NA> Estimate AME for slope of c_trial 

Note that both these solutions would marginalize over the sample distribution of covariates, if you had any covariates.

2 Likes

Ok, I checked and I don’t get the same results:

My original method

alpha_samples <- as_draws_df(fit_press_trial)$b_Intercept
beta_samples <- as_draws_df(fit_press_trial)$b_c_trial
effect_middle_ms <- exp(alpha_samples) -
  exp(alpha_samples - 1 * beta_samples)
## ms effect in the middle of the expt
## (mean trial vs. mean trial - 1)
c(
  mean = mean(effect_middle_ms),
  quantile(effect_middle_ms, c(0.025, 0.975))
)

gives

  mean   2.5%  97.5% 
0.0873 0.0666 0.1086 

and with @Solomon’s method, which should be as follows:

newdata_1 <- data.frame(c_trial = c(-1, 0))
middle <- fitted(fit_press_trial,
                 newdata = newdata_1,
                 summary = FALSE)
fitted_middle_ms <- middle[, 2] - middle[, 1]
c(
  mean = mean(fitted_middle_ms),
  quantile(fitted_middle_ms, c(0.025, 0.975))
)

I get:

  mean   2.5%  97.5% 
0.0880 0.0671 0.1095 

I don’t think that the numerical approximation should work differently, right? In both cases the samples are backtransformed to raw scale and the difference is calculated and then the summary statistics are done. So why am I getting slightly different results?

ok, I modified @Joshua_Wiley’s example so that it matches mine:
with data.frame(c_trial = c(-1, 0)),
and I tried it and I still get the same result as with fitted. How come there is a (small) difference in the output in comparison with my way of doing it?

(Unrelated to that, when would you want the average marginal effect of the slope?)

I don’t have an answer @bnicenboim but the intercept, back transformed, and the fitted values with c_trial = 0 do not match, so that is an even simpler example of not matching that would need to be understood before we could work out why/when differences would/would not agree.

bayestestR::describe_posterior(exp(alpha_samples))

#Summary of Posterior Distribution
#Parameter | Median |           95% CI |   pd |          ROPE | % in ROPE
#------------------------------------------------------------------------
#Posterior | 167.00 | [164.82, 169.17] | 100% | [-0.10, 0.10] |        0%

bayestestR::describe_posterior(
fitted(fit_press_trial, newdata = data.frame(c_trial = 0), summary = FALSE)[,1])


#Summary of Posterior Distribution
#Parameter | Median |           95% CI |   pd |          ROPE | % in ROPE
#------------------------------------------------------------------------
#Posterior | 168.27 | [166.15, 170.50] | 100% | [-0.10, 0.10] |        0%

In regards to your marginal slope question, An Introduction to ‘margins’ gives some nice examples. With nonlinear effects, the marginal slope is essentially the average rate of change across all trials, instead of just the contrasts between two specific trials.

oh this is unsettling:

> fitted(fit_press_trial, newdata= tibble(c_trial = 0))
     Estimate Est.Error Q2.5 Q97.5
[1,]      168      1.11  166   171
> c(mean(exp(alpha_samples)), quantile(exp(alpha_samples), c(.025,.975)))
       2.5% 97.5% 
  167   165   169 
> 

@paul.buerkner , maybe you could take a look? (You don’t need to follow the entire thread, my original post has the data and the model, and how I generated alpha_samples).

ehem, @paul.buerkner, is this a bug? Or am I doing something wrong here?

fitted in brms always returns samples of the mean of the posterior predictive distribution. In the lognormal case the mean is not exp(mu) but exp(mu + sigma^2/2). Hence the difference.

3 Likes

Ohhh yes or course! Im so used to work with the median in log scale that I didn’t realize that the mean would be the most logical default!

1 Like