Extremely wide 95% CIs when using custom contrast coding

Hello all,

I am building a multilevel ordinal logistic regression model for data with a 1-7 Likert-scale dependent variable (Rating) and 3 categorical independent variables (X = 2 levels, Y = 2 levels, Z = 3 levels).

The model is: brm(Rating ~ X * Y * Z + (1 + X * Y * Z | Item) + (1 + X * Z | Subj), data = df, family = cumulative("logit"), prior = set_prior("normal(0,2)", class = "b"), cores = 4, warmup = 2000, iter = 6000)

X and Y are sum coded, and I have set custom contrasts for Z (levels = n, e, and l) using the hypr package to test particular comparisons within factor Z: hypr(None_vs_Early = n~e, None_vs_Late = n~l, Early_vs_Late = e~l)

hypr object containing 3 null hypotheses:
H0.None_vs_Early: 0 = n - e
 H0.None_vs_Late: 0 = n - l
H0.Early_vs_Late: 0 = e - l

Hypothesis matrix (transposed):
  None_vs_Early None_vs_Late Early_vs_Late
e -1             0            1           
l  0            -1           -1           
n  1             1            0           

Contrast matrix:
  None_vs_Early None_vs_Late Early_vs_Late
e -1/3             0          1/3         
l    0          -1/3         -1/3         
n  1/3           1/3            0     

When I look at the output from describe_posterior, though, all of my CIs for the terms involving Z are very wide:

Parameter   Median CI  CI_low CI_high   ESS Rhat
Intercept.1. -4.11811 95 -4.5132  -3.742  2229    1
Intercept.2. -2.94517 95 -3.3151  -2.594  1996    1
Intercept.3. -2.12786 95 -2.4858  -1.781  1918    1
Intercept.4. -1.24716 95 -1.5938  -0.897  1862    1
Intercept.5.  0.08861 95 -0.2466   0.443  1830    1
           X  1.31585 95  0.8674   1.772  3887    1
       Z-nVe  0.60277 95 -1.6509   2.850 10962    1
       Z-nVl  0.58116 95 -1.7331   2.755 11061    1
       Z-eVl -0.00787 95 -2.2469   2.224 10937    1
           Y -0.24842 95 -0.8779   0.402  2203    1
     X.Z-nVe -0.19365 95 -2.4201   2.250 11564    1
     X.Z-nVl -0.26712 95 -2.6509   1.965 11745    1
     X.Z-eVl -0.07948 95 -2.4587   2.197 11635    1
         X.Y  0.89522 95  0.0615   1.737  4043    1
     Z-nVe.Y -0.31110 95 -2.5902   2.027 12310    1
     Z-nVl.Y  0.00225 95 -2.3935   2.256 12311    1
     Z-eVl.Y  0.32271 95 -1.9900   2.624 12083    1
   X.Z-nVe.Y -0.04146 95 -2.4771   2.237 13391    1
   X.Z-nVl.Y -0.39658 95 -2.6858   1.949 13435    1
   X.Z-eVl.Y -0.38375 95 -2.7340   1.986 12792    1

I tried changing the contrasts by dropping the third comparison (“e vs l”) in factor Z, then ran the model again and got much more sensible values for my Z coefficients given what the data actually look like:

Parameter  Median CI  CI_low CI_high   ESS Rhat
Intercept.1. -4.0966 95 -4.4872 -3.7167  1777    1
Intercept.2. -2.9268 95 -3.2950 -2.5698  1632    1
Intercept.3. -2.1144 95 -2.4784 -1.7701  1570    1
Intercept.4. -1.2377 95 -1.5756 -0.8827  1557    1
Intercept.5.  0.0880 95 -0.2465  0.4378  1549    1
           X  1.3051 95  0.8484  1.7474  2681    1
       Z-nVe  0.5773 95  0.3594  0.8199  8474    1
       Z-nVl  0.5813 95  0.3424  0.8128  8339    1
           Y -0.2585 95 -0.9052  0.4093  1033    1
     X.Z-nVe -0.2037 95 -0.6316  0.2153  9453    1
     X.Z-nVl -0.2745 95 -0.6692  0.0964 11942    1
         X.Y  0.8777 95  0.0451  1.7566  2069    1
     Z-nVe.Y -0.3194 95 -0.7726  0.1039  8596    1
     Z-nVl.Y  0.0122 95 -0.4413  0.4510  7940    1
   X.Z-nVe.Y -0.0553 95 -0.8372  0.7428  8963    1
   X.Z-nVl.Y -0.4382 95 -1.1803  0.2715 10730    1

Does anyone have any idea what may be going on here? I’m new to running these kinds of models and I’ve never tried setting these kinds of contrasts before.

  • Operating System: Linux Mint 19
  • brms Version: 2.10.0

Welcome to the forums!

The three contrasts are linearly dependent: None_vs_Late - None_vs_Early = Early_vs_Late, since (n - l) - (n - e) = e - l. Z has three levels, so it supports only two independent contrasts. Including all three introduces perfect multicollinearity into the model, which means the coefficients on the contrasts are not identified by the data, only by the priors.
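
A quick way to see the dependency is to check the rank of the hypothesis matrix from the hypr output in the question. The sketch below uses base R only and retypes that matrix by hand; the variable names (H, rank_H, dependent) are just illustrative.

```r
# Hypothesis matrix (transposed) as printed by hypr in the question:
# rows are the levels of Z, columns are the three requested comparisons.
H <- matrix(
  c(-1,  0,  1,   # e
     0, -1, -1,   # l
     1,  1,  0),  # n
  nrow = 3, byrow = TRUE,
  dimnames = list(c("e", "l", "n"),
                  c("None_vs_Early", "None_vs_Late", "Early_vs_Late"))
)

# There are three columns, but only two are linearly independent (rank is 2).
rank_H <- qr(H)$rank
print(rank_H)

# The third comparison equals the second minus the first, element by element.
dependent <- all(H[, "None_vs_Late"] - H[, "None_vs_Early"] == H[, "Early_vs_Late"])
print(dependent)
```

A rank lower than the number of columns means at least one comparison carries no information beyond the others, so the data cannot separate its coefficient from theirs.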

That would also explain why the results are more sensible when you drop one of the contrasts. If you want an estimate for the Early_vs_Late comparison, you can run the model with two contrasts, extract the posterior draws for those two coefficients, and then calculate the difference between them for each draw. I hope that makes sense.
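
Here is a rough sketch of that last step in base R. It does not use a fitted model: the draws are simulated stand-ins with made-up means and standard deviations, and the column names b_nVe and b_nVl are hypothetical. With a real brms fit you would replace the simulated data frame with the posterior draws of the two Z coefficients.

```r
set.seed(1)

# Simulated stand-in for posterior draws of the two Z coefficients.
# Assumption: illustrative values only, not results from the model above.
draws <- data.frame(
  b_nVe = rnorm(4000, mean = 0.58, sd = 0.12),  # None_vs_Early
  b_nVl = rnorm(4000, mean = 0.58, sd = 0.12)   # None_vs_Late
)

# Early_vs_Late = None_vs_Late - None_vs_Early, computed draw by draw.
# With real draws this keeps the posterior correlation between the two.
early_vs_late <- draws$b_nVl - draws$b_nVe

# Posterior median and 95% equal-tailed interval for the derived contrast.
summary_evl <- c(
  median = median(early_vs_late),
  quantile(early_vs_late, probs = c(0.025, 0.975))
)
print(round(summary_evl, 3))
```

Taking the difference within each draw, rather than subtracting the two summary medians, is what gives a valid interval for the derived comparison. brms also has a hypothesis() function that can evaluate a difference of two coefficients directly; the exact coefficient names to pass depend on your fit.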
