Wide CIs in last level when using "contr.sum"

When using sum contrasts for factors, the conditional effect (from conditional_effects()) of the last level of the factor always has a particularly wide credible interval (see figure for an example). This happens regardless of the data and model.

Can someone shed light on this? (Might be related to this, but I am not sure: Extremely wide 95% CIs when using custom contrast coding)

[brms 2.12.0, R 3.6.2, macOS 10.15.4]

Hi @stefanoc88.

Welcome to the forum! This is an interesting question, and it would help if you provided more information on the data you are working with, or a reproducible example (e.g. a simulation). I tried to reproduce this with a small simulation on my own system (Windows 10, R 3.6.2, brms 2.12.0) but could not (find the example here: example_contr_sum.R (4.1 KB)).
As you can see in the plot below, with equally large simulated groups the CIs are indeed also equally wide under sum-to-zero coding.


Can you provide more information about the data and models you are working with?



Hi @julianquandt!

Thanks for your reply! Actually, I have now realised that this happens only with sample_prior = "only" (sorry for the mistake; I'll update the question. EDIT: I don't seem to be able to edit the Q).

For example, with the following (added to your script):

m <- brm(
  y ~
    x_f +
    (1 | participant) +
    (1 | stimulus),
  data = exp1, cores = 4,
  sample_prior = "only",
  prior = prior(normal(0, 100), class = b)
)
Now the CI of level 4 is wider.

So my question is: why, when sampling only from the prior, does the last level of a factor with contr.sum have a wider CI?



I haven’t had time to go through the code, but I’d imagine that by ‘last level’ you mean the level coded as -1? In that case, no prior is placed directly on that level, because its coefficient is determined by the coefficients of the other levels. Thus, when sampling from the prior predictive distribution, much more uncertainty is attributed to that level: its implied prior is the combination of the priors on all the other levels. I’m not 100% sure, though.
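The arithmetic behind this can be sketched in base R (a minimal illustration, not part of the thread's script; the number of draws and the prior SD of 100 are taken from the brm() call above). With four levels under contr.sum, the last level's coefficient is the negative sum of the other three, so under independent Normal(0, 100) priors its implied prior SD is 100 * sqrt(3):

```r
set.seed(1)
n <- 1e5
# Independent prior draws for the three estimated coefficients b1..b3
b <- matrix(rnorm(3 * n, mean = 0, sd = 100), ncol = 3)
# Under contr.sum, the level coded as -1 gets the implied coefficient
# b4 = -(b1 + b2 + b3)
b4 <- -rowSums(b)
sd(b[, 1])  # ~ 100, the stated prior SD
sd(b4)      # ~ 100 * sqrt(3), i.e. about 173: a wider prior interval
```

This matches the observation in the question: only the level whose coefficient is derived from the others shows the inflated prior-predictive CI.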


What @cgoold says. If you explicitly exclude the intercept from the model (y ~ 0 + x_f), all factor levels have the same CI width, because each level's coefficient is then estimated directly:
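The same point can be illustrated in base R (a hedged sketch, not the actual model fit): with cell-means coding, no level's coefficient is derived from the others, so under independent Normal(0, 100) priors every level's prior interval has the same width.

```r
set.seed(2)
n <- 1e5
# One directly estimated coefficient per level, each with its own
# Normal(0, 100) prior -- no level is a function of the others
mu <- matrix(rnorm(4 * n, mean = 0, sd = 100), ncol = 4)
apply(mu, 2, sd)  # all four prior SDs are ~100: equal CI widths
```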


Thanks both, that makes perfect sense!