Model interpretation and making inference

Dear all,

I just want to make sure that I fully understand my model. I ran the following model to test how the two levels voicing capture differences in the response variable. The response variable is standardised and I’m using contr.sum coding. I’m using the package emmeans for this to contrast this.

m1 <- brm(intensity_offset.y_z ~ position*voicing*target_vowel+poa+rep+
                     (1+position*voicing*target_vowel+poa|File.name)+
                     (1+position|word),
                   data= dat
                    ...)

Here is the model summary

Population-Level Effects: 
                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept                 -0.26      0.23    -0.72     0.20 1.00     5098     9537
voicing1                   0.28      0.07     0.14     0.42 1.00    21990    19843
target_vowel1              0.04      0.09    -0.14     0.22 1.00    23031    21668
target_vowel2             -0.17      0.09    -0.35     0.01 1.00    21877    19885
poa1                       0.13      0.09    -0.05     0.30 1.00    23706    20775
poa2                      -0.09      0.10    -0.30     0.12 1.00    23112    20642
rep                       -0.03      0.03    -0.08     0.02 1.00    59711    25886
voicing1:target_vowel1     0.14      0.08    -0.03     0.31 1.00    23040    20713
voicing1:target_vowel2    -0.01      0.08    -0.18     0.16 1.00    24001    21381

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.43      0.01     0.40     0.46 1.00    25981    25172

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

I used the following code to extract the estimate for each level of voicing using emmeans.


r3 <- emmeans(m1, "voicing")
plot(rope(r3))

This code produces the following estimates and plot.

 voicing    emmean lower.HPD upper.HPD
 voiced     0.0212    -0.431    0.4853
 voiceless -0.5393    -1.035   -0.0397

Point estimate displayed: median 
HPD interval probability: 0.95 

t1

Q1- What inference can I make from these estimates and plot?

The possible answer that I have and want to check whether it is valid is:
a- the estimates and plot show that voiceless category exhibited a decreased effect compared to the voiced category.
b- The difference between the estimates of both category is not reliable due to the substantial overlap of the 95% quantile credible (or HDI) for both distributions. This means differences in the response variable x is not robust captured by the voicing predictor. (Is this a valid conclusion based on the estimate and plot?)

However, when I used the following code to plot the difference/contrast only between both voicing categories using emmeans:

r <- emmeans(m1, ~  voicing)
pairs(r) -> r1
plot(rope(r1))

this code gives me the following estimate and plot.

 contrast           estimate lower.HPD upper.HPD
 voiced - voiceless    0.563     0.287     0.836

Point estimate displayed: median 
HPD interval probability: 0.95

test

As is seen, the difference between both voicing categories seems far from the rope and the 95% HDI of the posterior distribution falls outside the rope.
Q- Doesn’t this plot contradict the second conclusion I drew from the first plot and estimate (i.e., The difference between the estimates of both category is not reliable due to the substantial overlap of the 95% quantile credible (or HDI) for both distributions)?

I’m not sure which estimate and plot makes the correct inference with regard to my goal which looks at whether differences in the response variable are robustly captured by the two levels of the voicing predictor. Am I missing something? Does each estimate and plot provide a different aspect of the model?

Your input is greatly appreciated!

@Dallak hello, it’s often a bit dangerous to use the apparent overlap of distributions or intervals to try and infer something about the difference between the estimates. If you’re primarily interested in the difference between the predicted values between conditions then you should express the posterior distribution for the difference directly. A key benefit of a Bayesian approach is that it’s easy to express uncertainty statements for all sorts of functions of parameters.

Note that of course the difference between the point estimates 0.0212 and -0.5393 is about 0.56, the same as the contrast. The point estimate 0.0212 is above the upper.HPD of -0.0397, so the corresponding HPD for the difference must exclude zero (which it does, just). So these expressions are ultimately equivalent, but which to use is case-dependent. If it’s not clear why the intervals for each estimate shouldn’t be directly compared to each other, consider with reference to the model parameters why these intervals are correlated. A side note is that it’s not really a good idea to make a dichotomous decision about reliability because an interval crosses zero.

I personally like to show the expected value under different conditions with intervals, as in your emmeans case, but where specific differences are important they should be stated directly, say, the difference in outcome between treatments, a risk ratio, etc.

2 Likes

Thank you very much @AWoodward for this helpful answer.

The point estimate 0.0212 is above the upper.HPD of -0.0397 , so the corresponding HPD for the difference must exclude zero (which it does, just)"

Q1- So does it always true that when the point estimate of x predictor is above the upper.HPD or below the lower.HPD of y predictor, this means the HPD for the difference must exclude zero? Should I take this a rule of thumb in such cases?

Q2- Do you think removing the correlation from the model may fix this issue?

Q3- Could you please elaborate a bit on this point?

If you’re primarily interested in the difference between the predicted values between conditions then you should express the posterior distribution for the difference directly.

Thank you again for your time and concerns.

  1. for linear models with simple categorical predictors, yes. But this does not extend to other cases (non-normal response, interactions, etc.). In general the interpretation and presentation of results from models beyond quite simple ones is a non-trivial problem. You might appreciate some of the discussions of those issues here.

  2. sorry if I was unclear; this doesn’t represent a problem to be solved. If your model has two parameters, an intercept \beta_0 and some coefficient for a binary predictor \beta_1, then its predictions are \beta_0 and \beta_0 + \beta_1. For fixed data, an MCMC sample having a small value of \beta_0 will probably have a large value of \beta_1. So the posterior distributions in your example are presented as if they are independent, but they aren’t (the model is a joint distribution for them).

  3. in an RCT for example, I might want to learn how much some outcome is different given a treatment compared to placebo. In a model like yours, I can obtain a posterior prediction of the expected value of the outcome for both treatment and placebo. That’s not a good expression of the effect of treatment; the placebo is there to be compared against, not because the outcome given placebo is of any interest. So it is better to express the difference directly, and this is particularly important when there are other predictors to consider. In a cross-sectional or cohort study the same reasoning could be applied to the risk ratio.

2 Likes

This is very informative. I highly appreciate your help @AWoodward.