I’ve got a logistic regression with population-level effects and group-level effects. The total sample size is N = 2217. Group A is very large (n = 1227), while the other groups are small, with sample sizes ranging from 1 to 46.
What would be the soundest way to model Group A as a fixed effect while continuing to model the other groups as group-level effects? Simply adding a dummy variable for Group A to the population-level part seems suboptimal, because the model would then have two almost perfectly confounded parameters (the fixed effect and Group A’s random effect), both referring to the same group.
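To spell out the overlap, here is a sketch of the model with the dummy added. The notation is assumed for illustration only, and it shows random intercepts only: $\beta_A$ is the coefficient on the Group A dummy, $g_i$ is the group of observation $i$, and $u_g$ is the group-level intercept.

$$
\operatorname{logit} \Pr(y_i = 1) = \beta_0 + \beta_A \, \mathbb{1}[g_i = A] + u_{g_i}, \qquad u_g \sim \mathcal{N}(0, \sigma_u^2)
$$

For every row in Group A the linear predictor is $\beta_0 + \beta_A + u_A$, so the likelihood informs $\beta_A$ and $u_A$ only through their sum. The two are separated only by their priors, including the shrinkage of $u_A$ implied by $\sigma_u$.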
I have already tried creating an alternate version of the grouping factor in which Group A’s value is set to NA, but this leads brms to discard those rows of data altogether.
An alternative that comes to mind is to create a version of the grouping factor in which Group A is merged with a set of “uninteresting” groups (I picked all those with n = 1, of which there are 180). This does allow the model to be fit, but the new heterogeneous group still largely reflects the patterns of Group A, so its random effect absorbs part of what should be attributed to Group A’s fixed effect.
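A minimal sketch of the two recodings described above, on toy data. All names here (`d`, `y`, `group`, `is_A`, `group_na`, `group_merged`, and the level label `A_plus_singletons`) are hypothetical placeholders, not the actual variables, and the model call is shown only as a comment.

```r
# Hypothetical toy data: one large group "A" plus several small groups.
set.seed(1)
d <- data.frame(
  y     = rbinom(12, 1, 0.5),
  group = c(rep("A", 6), "B", "B", "C", "D", "E", "F"),
  stringsAsFactors = FALSE
)

# Dummy variable for Group A, intended for the population-level part.
d$is_A <- as.integer(d$group == "A")

# Recoding 1: Group A's value set to NA in the grouping variable.
d$group_na <- ifelse(d$group == "A", NA, d$group)

# Recoding 2: Group A merged with all singleton groups (n = 1) into one level.
n_per_group <- table(d$group)
singletons  <- names(n_per_group)[n_per_group == 1]
d$group_merged <- ifelse(d$group %in% c("A", singletons),
                         "A_plus_singletons", d$group)

# Number of rows that survive listwise deletion under recoding 1,
# i.e. the rows that are not in Group A.
n_complete <- nrow(na.omit(d[, c("y", "is_A", "group_na")]))

# The merged version would then enter the model roughly like this (not run):
# fit <- brms::brm(y ~ is_A + (1 | group_merged),
#                  data = d, family = brms::bernoulli())
```

With recoding 1, every Group A row has a missing grouping value and is dropped before fitting. With recoding 2, all rows are kept, but the `A_plus_singletons` level is dominated by Group A.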
A better but highly labor-intensive approach would be to merge Group A with other, hand-selected groups whose random effects collectively cancel out that of Group A, thereby removing all overlap between Group A’s new fixed effect and the random-effect category into which Group A has been subsumed. But this would be an enormous hassle, requiring a large number of time-consuming fits and refits.
Is there a simpler, more elegant solution?