Hello :)
I’m seeking clarity and confidence in interpreting back-transformed parameters and reporting them in a paper.
Here, I compare three toy models (Gaussian, lognormal, Weibull) to understand the differences in interpretation:
library(brms)
library(posterior)
model1_gaussian <- brm(Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject),
                       data = data_Bague_time1,
                       warmup = 3000, iter = 7000,
                       cores = 2, chains = 2,
                       seed = 123)  # Gaussian (default) family
model1_lognormal <- brm(Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject),
                        data = data_Bague_time1,
                        family = lognormal(),
                        warmup = 3000, iter = 7000,
                        cores = 2, chains = 2,
                        seed = 123)
model1_weibull <- brm(Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject),
                      data = data_Bague_time1,
                      family = weibull(),
                      warmup = 3000, iter = 7000,
                      cores = 2, chains = 2,
                      seed = 123)
samples1_gaussian <- as_draws_array(model1_gaussian, variable = "^b_", regex = TRUE)
samples1_lognormal <- as_draws_array(model1_lognormal, variable = "^b_", regex = TRUE)
samples1_weibull <- as_draws_array(model1_weibull, variable = "^b_", regex = TRUE)
samples2_lognormal <- mutate_variables(samples1_lognormal, exp_b_Intercept = exp(b_Intercept), exp_b_Conditioncond1 = exp(b_Conditioncond1))
samples2_weibull <- mutate_variables(samples1_weibull, exp_b_Intercept = exp(b_Intercept), exp_b_Conditioncond1 = exp(b_Conditioncond1))
summarise_draws(samples1_gaussian)
summarise_draws(samples2_lognormal)
summarise_draws(samples2_weibull)
The results of the summaries:
summarise_draws(samples1_gaussian)
# A tibble: 8 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 b_Intercept 0.823 0.823 0.166 0.160 0.555 1.10 1.00 393. 739.
2 b_Conditioncond1 1.62 1.62 0.0548 0.0550 1.53 1.71 1.00 4623. 4970.
3 b_Conditioncond3 0.870 0.871 0.0534 0.0535 0.782 0.956 1.00 4810. 4881.
4 b_Consigne150 0.532 0.534 0.0764 0.0769 0.406 0.656 1.00 3226. 5122.
5 b_Consigne200 0.845 0.846 0.0751 0.0746 0.721 0.969 1.00 2621. 4662.
6 b_Consigne250 1.25 1.25 0.0748 0.0748 1.13 1.37 1.00 2988. 5168.
7 b_Consigne300 1.64 1.64 0.0756 0.0756 1.51 1.76 1.00 2843. 4634.
8 b_Consigne350 2.06 2.06 0.0760 0.0764 1.93 2.18 1.00 3110. 4559.
summarise_draws(samples2_lognormal)
# A tibble: 10 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 b_Intercept -0.00933 -0.00888 0.0513 0.0511 -0.0926 0.0744 1.00 633. 1133.
2 b_Conditioncond1 0.561 0.561 0.0172 0.0173 0.533 0.590 1.00 4622. 4940.
3 b_Conditioncond3 0.341 0.341 0.0168 0.0169 0.313 0.369 1.00 4850. 5328.
4 b_Consigne150 0.304 0.304 0.0243 0.0246 0.264 0.343 1.00 3050. 3544.
5 b_Consigne200 0.453 0.452 0.0242 0.0238 0.412 0.492 1.00 2975. 3828.
6 b_Consigne250 0.628 0.628 0.0243 0.0246 0.589 0.667 1.00 3032. 4172.
7 b_Consigne300 0.764 0.764 0.0243 0.0244 0.724 0.804 1.00 2958. 3971.
8 b_Consigne350 0.898 0.898 0.0242 0.0243 0.858 0.938 1.00 2817. 3759.
9 exp_b_Intercept 0.992 0.991 0.0509 0.0508 0.912 1.08 1.00 633. 1133.
10 exp_b_Conditioncond1 1.75 1.75 0.0301 0.0304 1.70 1.80 1.00 4622. 4940.
summarise_draws(samples2_weibull)
# A tibble: 10 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 b_Intercept 0.108 0.106 0.0541 0.0525 0.0229 0.200 1.00 502. 897.
2 b_Conditioncond1 0.574 0.574 0.0158 0.0158 0.548 0.599 1.00 4613. 4832.
3 b_Conditioncond3 0.366 0.366 0.0150 0.0148 0.342 0.391 1.00 4537. 5588.
4 b_Consigne150 0.256 0.256 0.0202 0.0201 0.222 0.289 1.00 3543. 4351.
5 b_Consigne200 0.408 0.408 0.0203 0.0199 0.374 0.441 1.00 3356. 4577.
6 b_Consigne250 0.545 0.545 0.0203 0.0205 0.512 0.578 1.00 3207. 4444.
7 b_Consigne300 0.699 0.699 0.0204 0.0207 0.665 0.732 1.00 3266. 4778.
8 b_Consigne350 0.836 0.836 0.0204 0.0208 0.803 0.870 1.00 3390. 4888.
9 exp_b_Intercept 1.12 1.11 0.0607 0.0582 1.02 1.22 1.00 502. 897.
10 exp_b_Conditioncond1 1.78 1.77 0.0281 0.0280 1.73 1.82 1.00 4613. 4832.
I found clues in Section 3.5 (Posterior predictive distribution) of An Introduction to Bayesian Data Analysis for Cognitive Science.
For the lognormal model
- For the response variable:
If I understand correctly, the median of the lognormal distribution is exp(μ) and the mean is exp(μ + σ²/2). Hence, to estimate the median or mean of my response variable, I use the full right-hand side of my model formula as μ.
- I see the simplicity of reporting the median; however, in non-statistician fields, don’t we expect the mean?
- For specific explanatory variables:
If I exponentiate a parameter, I can interpret it as a multiplicative effect.
- Using the posterior package with as_draws_array(), mutate_variables() with exp(), and summarise_draws(), I get the posterior mean and median of the exponentiated parameter, right? Again, even if they might be close, don’t we expect to report the mean and 95% CI of each parameter in a paper?
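To convince myself of the median/mean distinction, here is a minimal self-contained check in base R (no brms needed). In the comments I also sketch what I assume the equivalent would look like with my posterior draws — note that this would require extracting sigma alongside the b_ parameters, which my code above does not do; the variable name mean_ref is hypothetical.

```r
# Minimal check in base R: for Y ~ LogNormal(mu, sigma),
# the median is exp(mu) and the mean is exp(mu + sigma^2 / 2).
set.seed(123)
mu <- 0
sigma <- 0.5
y <- rlnorm(1e6, meanlog = mu, sdlog = sigma)

median(y)              # close to exp(mu) = 1
exp(mu)                # analytic median
mean(y)                # close to exp(mu + sigma^2 / 2) ~= 1.133
exp(mu + sigma^2 / 2)  # analytic mean

# With the brms draws, I assume the equivalent would be something like:
# samples <- as_draws_array(model1_lognormal,
#                           variable = "^b_|^sigma", regex = TRUE)
# samples <- mutate_variables(samples,
#                             mean_ref = exp(b_Intercept + sigma^2 / 2))
```

Is this the right way to get a posterior for the mean (rather than the median) of the response at the reference level?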
For the Weibull model
For the lognormal model the link function is the identity, whereas for the Weibull model the link function is log, and there is also a change: a shape parameter instead of sigma.
- Does it make a difference in interpretation to back-transform parameters with a simple exponential, compared to the lognormal model?
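If I’ve read the brms family documentation correctly (please correct me if not), the Weibull family parameterises mu = exp(η) as the distribution mean directly, with the usual scale parameter being scale = mu / Γ(1 + 1/shape) — so exp() of the linear predictor would already be a mean, not a median as in the lognormal case. A quick base-R check of that assumed parameterisation:

```r
# Sketch of what I believe brms' Weibull parameterisation to be
# (an assumption to verify): mu is the distribution mean, and
# the standard scale parameter is scale = mu / gamma(1 + 1/shape).
set.seed(123)
mu <- 2
shape <- 1.5
scale <- mu / gamma(1 + 1 / shape)
y <- rweibull(1e6, shape = shape, scale = scale)
mean(y)  # close to mu = 2
```

Does that mean exp_b_Intercept in my Weibull summary is a mean while exp_b_Intercept in my lognormal summary is a median?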
I’m confused by all these concepts, so do not hesitate to ask me for clarifications.
Do you know where I could find explanations on these distinctions ?
Thank you so much for your time !
Alex