# Interpret back-transformed parameters for Lognormal and Weibull models

Hello :)

I’m seeking clarity and confidence in interpreting back-transformed parameters so I can report them in a paper.

Here I compare three toy models (Gaussian, lognormal, Weibull) to understand the differences in interpretation:

```r
library(brms)
library(posterior)

# Fit the same model under three response distributions
model1_gaussian <- brm(Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject),
                       data = data_Bague_time1,
                       warmup = 3000, iter = 7000,
                       cores = 2, chains = 2,
                       seed = 123)

model1_lognormal <- brm(Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject),
                        data = data_Bague_time1,
                        family = lognormal(),
                        warmup = 3000, iter = 7000,
                        cores = 2, chains = 2,
                        seed = 123)

model1_weibull <- brm(Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject),
                      data = data_Bague_time1,
                      family = weibull(),
                      warmup = 3000, iter = 7000,
                      cores = 2, chains = 2,
                      seed = 123)

# Extract the population-level draws
samples1_gaussian <- as_draws_array(model1_gaussian, variable = "^b_", regex = TRUE)
samples1_lognormal <- as_draws_array(model1_lognormal, variable = "^b_", regex = TRUE)
samples1_weibull <- as_draws_array(model1_weibull, variable = "^b_", regex = TRUE)

# Back-transform (exponentiate) the intercept and one coefficient
samples2_lognormal <- mutate_variables(samples1_lognormal,
                                       exp_b_Intercept = exp(b_Intercept),
                                       exp_b_Conditioncond1 = exp(b_Conditioncond1))
samples2_weibull <- mutate_variables(samples1_weibull,
                                     exp_b_Intercept = exp(b_Intercept),
                                     exp_b_Conditioncond1 = exp(b_Conditioncond1))

summarise_draws(samples1_gaussian)
summarise_draws(samples2_lognormal)
summarise_draws(samples2_weibull)
```



The results of the summaries:

```
summarise_draws(samples1_gaussian)
# A tibble: 8 × 10
  variable          mean median     sd    mad    q5   q95  rhat ess_bulk ess_tail
  <chr>            <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
1 b_Intercept      0.823  0.823 0.166  0.160  0.555 1.10   1.00     393.     739.
2 b_Conditioncond1 1.62   1.62  0.0548 0.0550 1.53  1.71   1.00    4623.    4970.
3 b_Conditioncond3 0.870  0.871 0.0534 0.0535 0.782 0.956  1.00    4810.    4881.
4 b_Consigne150    0.532  0.534 0.0764 0.0769 0.406 0.656  1.00    3226.    5122.
5 b_Consigne200    0.845  0.846 0.0751 0.0746 0.721 0.969  1.00    2621.    4662.
6 b_Consigne250    1.25   1.25  0.0748 0.0748 1.13  1.37   1.00    2988.    5168.
7 b_Consigne300    1.64   1.64  0.0756 0.0756 1.51  1.76   1.00    2843.    4634.
8 b_Consigne350    2.06   2.06  0.0760 0.0764 1.93  2.18   1.00    3110.    4559.

summarise_draws(samples2_lognormal)
# A tibble: 10 × 10
   variable                 mean   median     sd    mad      q5    q95  rhat ess_bulk ess_tail
   <chr>                   <dbl>    <dbl>  <dbl>  <dbl>   <dbl>  <dbl> <dbl>    <dbl>    <dbl>
 1 b_Intercept          -0.00933 -0.00888 0.0513 0.0511 -0.0926 0.0744  1.00     633.    1133.
 2 b_Conditioncond1      0.561    0.561   0.0172 0.0173  0.533  0.590   1.00    4622.    4940.
 3 b_Conditioncond3      0.341    0.341   0.0168 0.0169  0.313  0.369   1.00    4850.    5328.
 4 b_Consigne150         0.304    0.304   0.0243 0.0246  0.264  0.343   1.00    3050.    3544.
 5 b_Consigne200         0.453    0.452   0.0242 0.0238  0.412  0.492   1.00    2975.    3828.
 6 b_Consigne250         0.628    0.628   0.0243 0.0246  0.589  0.667   1.00    3032.    4172.
 7 b_Consigne300         0.764    0.764   0.0243 0.0244  0.724  0.804   1.00    2958.    3971.
 8 b_Consigne350         0.898    0.898   0.0242 0.0243  0.858  0.938   1.00    2817.    3759.
 9 exp_b_Intercept       0.992    0.991   0.0509 0.0508  0.912  1.08    1.00     633.    1133.
10 exp_b_Conditioncond1  1.75     1.75    0.0301 0.0304  1.70   1.80    1.00    4622.    4940.

summarise_draws(samples2_weibull)
# A tibble: 10 × 10
   variable              mean median     sd    mad     q5   q95  rhat ess_bulk ess_tail
   <chr>                <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
 1 b_Intercept          0.108  0.106 0.0541 0.0525 0.0229 0.200  1.00     502.     897.
 2 b_Conditioncond1     0.574  0.574 0.0158 0.0158 0.548  0.599  1.00    4613.    4832.
 3 b_Conditioncond3     0.366  0.366 0.0150 0.0148 0.342  0.391  1.00    4537.    5588.
 4 b_Consigne150        0.256  0.256 0.0202 0.0201 0.222  0.289  1.00    3543.    4351.
 5 b_Consigne200        0.408  0.408 0.0203 0.0199 0.374  0.441  1.00    3356.    4577.
 6 b_Consigne250        0.545  0.545 0.0203 0.0205 0.512  0.578  1.00    3207.    4444.
 7 b_Consigne300        0.699  0.699 0.0204 0.0207 0.665  0.732  1.00    3266.    4778.
 8 b_Consigne350        0.836  0.836 0.0204 0.0208 0.803  0.870  1.00    3390.    4888.
 9 exp_b_Intercept      1.12   1.11  0.0607 0.0582 1.02   1.22   1.00     502.     897.
10 exp_b_Conditioncond1 1.78   1.77  0.0281 0.0280 1.73   1.82   1.00    4613.    4832.
```


For the lognormal model

- For the response variable:

If I understand correctly, the median of the log-normal distribution is exp(μ) and the mean is exp(μ + σ²/2). Hence, to estimate the median or mean of my response variable, I consider the whole right-hand side of my model formula.

1. I see the simplicity of reporting the median; however, in non-statistician fields, don’t we expect the mean?
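These two formulas are easy to sanity-check with a quick base-R simulation, independent of the brms fit (the μ and σ values below are arbitrary, purely for illustration):

```r
# Sanity check: for a log-normal distribution with log-scale parameters
# mu and sigma, the median is exp(mu) and the arithmetic mean is
# exp(mu + sigma^2 / 2).
set.seed(123)
mu <- 0.5      # arbitrary illustrative values
sigma <- 0.8
x <- rlnorm(1e6, meanlog = mu, sdlog = sigma)

median(x)                 # close to exp(mu) ~ 1.649
exp(mu)
mean(x)                   # close to exp(mu + sigma^2 / 2) ~ 2.270
exp(mu + sigma^2 / 2)
```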

- For specific explanatory variables:

If I exponentiate a parameter, I can interpret it as a multiplicative effect.

1. Using the posterior package with as_draws_array(), mutate_variables() with exp(), and summarise_draws(), I get the mean and median of the median effect of each parameter, right? Again, even if they might be close, don’t we expect to report the mean and 95% CI of each parameter in a paper?
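One wrinkle with that exponentiate-then-summarise pipeline: the posterior median commutes with exp(), but the posterior mean does not (Jensen’s inequality), so mean(exp(b)) is slightly larger than exp(mean(b)). A small base-R illustration with simulated draws (the sd is exaggerated here to make the gap visible):

```r
# Simulated "posterior draws" of a coefficient; sd exaggerated for visibility.
set.seed(123)
b <- rnorm(100001, mean = 0.561, sd = 0.2)

median(exp(b))    # exactly exp(median(b)): the median commutes with exp()
exp(median(b))
mean(exp(b))      # larger than exp(mean(b)): the mean does not commute
exp(mean(b))
```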

For the Weibull model

For the lognormal model, the link function is identity, whereas for the Weibull model the link function is log; there is also a shape parameter instead of sigma.

1. Does it make a difference in interpretation to back-transform parameters with a simple exponential, compared to the lognormal model?

I’m confused by all these concepts, so do not hesitate to ask me for clarifications.

Do you know where I could find explanations of these distinctions?

Thank you so much for your time!

Alex

For a log-normal distribution, exp(μ) is the geometric mean. I never realised it was also the median. Anyway, that doesn’t really matter. And yes, exp(μ + σ²/2) is the arithmetic mean (i.e., the metric we normally refer to as the “mean”).

To compare the arithmetic means of the three fitted distributions, yes, you need to transform some of the parameters. For the log-normal, proceed as discussed above. For the Weibull, it is simply the exponential of the linear predictor. For the Gaussian case, your model directly gives you the arithmetic mean. I’m saying this without having read your code very closely; let me know if this is not the confirmation you needed.

Thank you so much for your quick answer, @LucC!!

I’m surprised it is that easy to report back-transformed estimates for a Weibull distribution, but so be it :)

In my answer, I’m assuming that the linear predictor in your Weibull model is a linear predictor for the log of the mean, not for the scale. Sorry, I didn’t check. If the linear predictor is the log of the scale parameter of the Weibull distribution, the mean is scale * gamma(1 + 1/shape), where “gamma” is the gamma function.
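That scale-and-shape formula is easy to verify numerically (base R, arbitrary illustrative parameter values):

```r
# For a Weibull distribution, mean = scale * gamma(1 + 1/shape).
set.seed(123)
shape <- 3     # arbitrary illustrative values
scale <- 2
x <- rweibull(1e6, shape = shape, scale = scale)

mean(x)                       # close to the analytic value
scale * gamma(1 + 1/shape)    # = 2 * gamma(4/3) ~ 1.786
```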

Again, thank you so much for your quick answer!

I don’t see where I can get this scale parameter in my Weibull model:

```
> model1_weibull
 Family: weibull
  Links: mu = log; shape = identity
Formula: Force_mean_baseline ~ 1 + Condition + Consigne + (1 | Subject)
   Data: data_Bague_time1 (Number of observations: 3439)
  Draws: 2 chains, each with iter = 7000; warmup = 3000; thin = 1;
         total post-warmup draws = 8000

Group-Level Effects:
~Subject (Number of levels: 42)
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.33      0.04     0.27     0.42 1.00      691     1346

Population-Level Effects:
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept          0.11      0.05     0.00     0.22 1.00      502      897
Conditioncond1     0.57      0.02     0.54     0.60 1.00     4613     4832
Conditioncond3     0.37      0.01     0.34     0.40 1.00     4537     5588
Consigne150        0.26      0.02     0.22     0.30 1.00     3543     4351
Consigne200        0.41      0.02     0.37     0.45 1.00     3356     4577
Consigne250        0.54      0.02     0.51     0.58 1.00     3207     4444
Consigne300        0.70      0.02     0.66     0.74 1.00     3266     4778
Consigne350        0.84      0.02     0.80     0.88 1.00     3390     4888

Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape     3.00      0.04     2.92     3.08 1.00     5517     5637

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```


For example, if I want to back-transform the parameter Conditioncond1, should I do something like:

```r
samples2_weibull <- mutate_variables(samples1_weibull,
                                     mean_b_Conditioncond1 = b_Conditioncond1 * gamma(1 + 1/shape))
```

1. The link function for mu is log, but for shape it is identity. Don’t I need an exponential somewhere for mu? What would the transform be if shape’s link function were also log?

2. Do you know if there is any package or function that does the back-transform automatically?

Thank you so much for your time!

Hi Alexandre, the Links line above tells me that your model is parameterised in terms of the mean (i.e., the linear predictor is the log of the mean) and the shape. My last message only applied in case your model was parameterised in terms of scale and shape. So for your model, if you exponentiate the intercept, you will get the mean of whatever the reference group is in your model. Exponentiating the coefficients of the “Condition” and “Consigne” parameters will give you the relative difference between those groups and the reference (for every unit change in the predictor).
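That multiplicative reading can be illustrated with a small base-R simulation. This sketch plugs in the point estimates from the fit above and simulates from a mean-parameterised Weibull, converting each mean to a scale via scale = mean / gamma(1 + 1/shape):

```r
# Under a log link on the mean, exp(coefficient) is the ratio of group
# means, whatever the shape parameter.
set.seed(123)
shape <- 3     # point estimates from the fit above
b0 <- 0.11     # Intercept (log-mean of the reference group)
b1 <- 0.57     # Conditioncond1

# mean = scale * gamma(1 + 1/shape), so scale = mean / gamma(1 + 1/shape)
scale_ref   <- exp(b0)      / gamma(1 + 1/shape)
scale_cond1 <- exp(b0 + b1) / gamma(1 + 1/shape)

x_ref   <- rweibull(1e6, shape = shape, scale = scale_ref)
x_cond1 <- rweibull(1e6, shape = shape, scale = scale_cond1)

mean(x_ref)                   # close to exp(b0): mean of the reference group
mean(x_cond1) / mean(x_ref)   # close to exp(b1) ~ 1.768: the ratio of means
exp(b1)
```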

For reference, Wikipedia often has useful information on the parameters and summary statistics of various distributions, such as the Weibull distribution. There you can see, e.g., the relation between shape, scale, mean, and variance.

Thank you very much, @LucC !! :)