Does a sum-to-zero constraint imply balanced groups?

In the Stan User’s Guide I can read the following regarding multiple intercepts in multilevel regression:

it can be more computationally tractable to enforce a sum-to-zero constraint on the coefficients. Other values than zero will by necessity be absorbed into the intercept, which is why it typically gets a broader prior even with standardized data. With a sum to zero constraint, coefficients for binary features will be negations of each other.

What is unclear to me is whether this implies an assumption that the groups are balanced. In other words: if I have groups A, B and C, and I include an intercept in the model for each group while constraining their sum to zero, am I assuming the groups to be equally sized?

Nope, there’s no need to assume equally sized groups. The constraint applies only to the parameters, which would otherwise not be identified. Alternatives are to use a prior or to pin one of the values, as discussed in the User’s Guide.
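
To make this concrete, here is a minimal sketch in plain Python (not Stan), assuming a model with nothing but a global intercept `mu` and one offset `alpha[g]` per group. The data, group sizes, and variable names are made up for illustration. For such a model the least-squares fit of each group is simply its sample mean, so the sum-to-zero constraint only decides how each fitted mean is split between `mu` and `alpha[g]`. Group sizes never enter that split.

```python
from statistics import mean

# Hypothetical, deliberately unbalanced data: group sizes 2, 3 and 10.
data = {
    "A": [1.0, 3.0],
    "B": [4.0, 5.0, 6.0],
    "C": [10.0] * 10,
}

# With only group intercepts, the least-squares fit for a group is its sample mean.
group_means = {g: mean(ys) for g, ys in data.items()}

# Sum-to-zero parameterization: mu + alpha[g] equals each group mean and
# sum(alpha) == 0. Summing those equations over the groups shows that mu is
# the unweighted mean of the group means.
mu = mean(list(group_means.values()))
alpha = {g: m - mu for g, m in group_means.items()}

# For comparison: the grand mean of all observations, which is weighted by group size.
grand_mean = mean([y for ys in data.values() for y in ys])

print("group means:", group_means)
print("mu:", mu, "alpha:", alpha)
print("sum of alpha:", sum(alpha.values()))
print("grand mean of observations:", grand_mean)
```

The fitted group means are reproduced exactly despite the unequal sizes. What does change with imbalance is the reading of `mu`: under the constraint it is the unweighted average of the group-level means, not the observation-weighted grand mean. In a Bayesian fit with priors the estimates would also be pulled toward the prior, but the constraint itself still carries no information about group sizes.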
