Contrasts for distributional parameters in a shifted log normal distrbution


I’m playing around with shifted log normal distributions (for reaction time data), trying to get my head around the parameter estimates and also how to do contrasts with them using the emmeans package (which from brms v2.13 and later, has its own emmeans support according to the documentation. The output of the model is as below,

Screen Shot 2020-08-18 at 7.42.12 pm

Now, the Intercept value makes sense as exp(5.21) gives 183 ms. Similarly, the ndt_Intercept makes sense too as exp(5.05) gives 156 ms. But, the sigma_Intercept doesn’t make sense as exp(-0.97) because 0.38 ms is highly improbable, but it does make sense as exp(5.21-0.97), which gives 69.4 ms. My questions are as follows then,

  1. Is the above understanding of sigma_Intercept as exp(5.21-0.97) correct?

  2. If so, is there a way to re-parameterize the model such that the sigma_Intercept estimate outputs 4.24 (5.21-0.97) directly as this would make it a lot easier to use emmeans(model_fit, pairwise ~ factor, dpar = "sigma"). Currently, it picks up only the -0.97…

  3. On a slightly related note, the summary of the model_fit tells us that the mu parameter has an identity link function. Given this, the type = "response" argument for the emmeans function doesn’t exponentiate the estimates (and doesn’t give a ratio for the contrasts). This is a small point and I know I can do this myself but I was wondering if there was a way to do this?

Thank you for reading.

some of the response depends on the exact formula (and potentially other code) you used to fit your model, could you share those, as well as the complete model output?

Not really. First, focusing too much on the estimate can be misleading, the reported uncertainty is also important. Second, if I am guessing correctly, sigma_Intercept is just the intercept of the linear model for sigma, the interpretation of the intercept cannot be made without understanding how the other covariates interact with it.

Generally, with such complex models, there will be a ton of interactions between the model parameters and I am not sure emmeans will handle it well. If you are interested only in the log mean (i.e. you don’t care about sampling variability), then focusing only on the main formula for mean might be enough and you can ignore sigma (as that just represents the sampling variability and that it is not the same for all groups). If you need to do anything more complex, I usually find it better to just use posterior_predict, posterior_linpred or posterior_epred to generate predictions for a suitably chosen data, representing a hypothetical future experiment (say all combinations of trial and bias with average age). And then look at the comparisons of interest in those predictions. This is a bit harder as it forces you to think about a lot of details, but those details can matter, so being explicit about them is IMHO beneficial

I wrote about this elsewhere, i.e Inferences marginal of random effects in multi-level models - #3 by martinmodrak and Multivariate model interpreting residual correlation and group correlation - #6 by martinmodrak , but that was in a quite different context so feel free to ask for clarifications!

Best of luck with your model!

Hi @martinmodrak, thank you for your response.

could you share those, as well as the complete model output?

Surely, these are as below,

What are your thoughts now based on the above?

  1. Is sigma_Intercept interpreted as as exp(5.21-0.97) or is there another interpretation?

  2. If so, is there a way to re-parameterize the model such that the sigma_Intercept estimate outputs 4.24 (5.21-0.97) directly as this would make it a lot easier to use emmeans(model_fit, pairwise ~ factor, dpar = "sigma") . Currently, it picks up only the -0.97… @paul.buerkner any insights?

  3. For anyone else wondering about the 3rd question, @Russ_Lenth answered this here.

Thank you for the suggestions of posterior_predict, posterior_linpred or posterior_epred too, I’ll look into that and the links you’ve shared as well.

Well, I have to say I am confused by these questions. I think the parameters are being misinterpreted. As I read the output, the three parts of the model predict as follows:

  • mu = 5.21 + 0.41*age_group1 + 0.01*bias_type1 – 0.22*trial_type1
  • log(sigma) = -0.97 + 0.15*age_group1
  • log(ndt) = 5.05 + 0.03*age_group1

Now, question 1 refers to the quantity 5.21 – 0.97, which is the sum of the intercepts of two different models. I have no idea why this would be relevant to anything; they aren’t even on the same scale. In Q.2, the model would estimate sigma to be exp(-0.97) for the first age group (age group 0?) and exp(-0.97 + 0.15) = exp(-0.82) for age group 1. If you add type = "response" to that emmeans() call, those are the estimates you will get. The difference between the means is 0.15 on the log scale, meaning that with type = "response", the sigma for age group 1 will be about 16% higher (exp(0.15) ~= 1.16) for age group 1 than the first age group.

1 Like

BTW, I’m now not sure if the answer I am quoted in in Q.3 is correct. It depends on how mu and ndt are to be interpreted. It may not be at all appropriate to force a log link on the mu model.

1 Like

Thanks for your response @rvlenth.

  1. With regards to the first question, I understand your response and the use of type = "response" in the emmeans call but exp(-0.97) is 0.379. If this is on the response scale (which is milliseconds), 0.379 milliseconds for the first age group (age group is a categorical variable here) seems unrealistic in regards to sigma (i.e., the standard deviation of the distribution). As does exp(-0.82) or 0.440 ms for the second age group…

  2. Regarding the mu parameter, I’m not completely sure either but this website here suggests to exponentiate it.

Well, these are things I can’t say much about., but I am quite sure I interpreted the model coefficients correctly. What I don’t know, and what you need to find out, is what is meant by the parameters mu, sigma, and ndt. I know that, often, a lognormal distribution is parameterized by \mu and \sigma, but that these do not refer to the mean and SD of the response Y, but rather the mean and SD of \log Y. If that’s the case, then the mean of Y is \exp(\mu + \frac12\sigma^2). Perhaps there is a similar parameterization here. And then there is the meaning of the shifted part of the distribution; I suspect it is an additive shift and if so, the mean is \theta+\exp(\mu + \frac12\sigma^2) where \theta is the shift parameter. I am sure it is documented in the brms package, and I could look it up, but I think I should leave that to you.

1 Like

Thanks for the context. I think @rvlenth wrote a lot of what I wanted to say, while I was preparing my answer - I basically describe the same stuff, but in a bit different language (and I didn’t want to delete my draft, so I am posting it as well :-).

I fear you are misunderstanding multiple things about how brms (and linear models in general) work. Your chosen response distribution shifted_lognormal is constructed as ndt + exp(w) where w ~ normal(mu, sigma). So mu and sigma refer to mean and sd on the log scale, before shifting. Since mean on the log scale is not bounded, the link is identity - that’s just how this distribution expects it’s input to behave. However sigma needs to be positive, hence the log link (so what is modelled is the logarithm of the sd on the log scale, before shift).

For each of these parameters, you did setup a separate linear model. But (for this response distribution) all of the parameters influence posterior mean, hence all model coefficients influence posterior mean. They are quite hard to interpret in isolation. Also the posterior has uncertainty and it is possible that the posterior has correlations (e.g. largermu implying lower ndt). And this is where using predictions can be handy, because predictions take all of this into account.

In your specific case, the posterior uncertainty is small, so ignoring it will not make your inferences meaningless, but it is still bad parctice.

Just to extend that, one way to roughly interpret this is that for the reference group (i.e the reference age_group, bias_type and trial_type) you would expect 95% of responses to lie between 5.05 + exp(5.21 - 2*exp(-0.97)) and 5.05 + exp(5.21 + 2*exp(-0.97)) with mean 5.05 + exp(5.21 + 0.5 * exp(-0.97)^2. But I repeat, this is a bit hacky and I would really recommend using the functions for model predictions which should do all the necessary math for you. I’ve never used emmeans myself, so I am not sure how much “magic” they can do, but I wouldn’t expect the package to handle those less common brms models well.

Also all the above only holds if your model is a good fit for your data, which you should verify with posterior predictive checks.

Good luck with your model!

1 Like

4 posts were merged into an existing topic: Integrating emmeans with brms

Thank you so much @rvlenth and @martinmodrak for pointing me in the right direction! All of it makes so much more sense now.

I did a bit more reading and now know how the parameters from the model relate to the mean and the variance of the distribution (this website was also great in that regard). I read the posterior_epred and posterior_predict functions and also saw how I can use conditional_effects (specifically, with the method argument to change between the two). My main reason for asking about how the parameters relate exactly was so that I can use the hypothesis function to get Bayes Factor for my predictors of interest, like shown here. I can then report the BF for the predictor as well as the mean/median difference and the uncertainty around it. I’m probably not fully there yet but feel like I have a lot to work with…