Modeling interaction with dummy variable and control for potential confounding variables


My apologies if this is a simple question, but I’m having difficulty in specifying a model with complex interactions.
Say I want to model a count variable (countVar) with income as an independent variable. I would like to explore whether a dummy variable interacts with income to influence the count variable. The dummy variable may itself be influenced by the respondent’s gender and marital status.
If I want to control for these potentially confounding factors, would I go from this model specification (in brms syntax):
mod <- brm(countVar ~ income + income:dummy + (1 + income | region), family = poisson, …)

To this model:

mod <- brm(countVar ~ income + income:dummy + (1 + income | region) + (dummy | respondentGender:maritalStatus), family = poisson, …)

Or, is there another way I should be thinking of this?


First of all, do you have a specific reason to use income + income:dummy instead of income*dummy?

Second, the second model doesn’t look right to me. I don’t know exactly what “influenced” means for you in this case, so you may need to explain that a little more.


Yes, that should have been: income + dummy + income:dummy … or income*dummy
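
For reference, a quick base R check of that equivalence. The toy data frame below is made up for illustration; only the variable names come from the model above.

```r
# Toy data; the values are invented, only the names mirror the thread.
d <- data.frame(income = c(1, 2, 3, 4), dummy = c(0, 1, 0, 1))

# income * dummy is shorthand for both main effects plus their interaction.
cols_star <- colnames(model.matrix(~ income * dummy, data = d))
cols_long <- colnames(model.matrix(~ income + dummy + income:dummy, data = d))

# TRUE if both formulas produce the same design-matrix columns.
print(identical(cols_star, cols_long))
print(cols_star)
```

Leaving out the dummy main effect, as in the first formula, would instead force both dummy groups to share one intercept.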

In this instance, income and the dummy seem to have a ‘significant interaction’. This interaction may be genuine, or confounded by other factors such as gender and marital status. I’m not sure of the best way to assess this.

The alternative I had in mind was to have an interaction between income and a factor based on the combination of dummy, respondent gender and marital status.
I suspect that there’s a better way of modelling it though.


To just “control” for gender and marital status, assuming no interaction with income or dummy, you could add

+ respondentGender + maritalStatus

to your model formula, but I expect you already know that. The bottom line is that I believe “group-level” terms (i.e. the ( x | y ) terms) won’t help you solve this problem.
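
As a rough sketch of what that adjustment looks like, here is a base R glm() fit on simulated data. Everything about the data is an assumption made up for illustration, and the region term is left out so the example runs without Stan; in brms the same covariates would simply be appended to the existing formula.

```r
set.seed(1)
n <- 500

# Simulated stand-in data; all values here are invented.
dat <- data.frame(
  income = rnorm(n),
  dummy = rbinom(n, 1, 0.5),
  respondentGender = factor(sample(c("female", "male"), n, replace = TRUE)),
  maritalStatus = factor(sample(c("single", "married", "other"), n, replace = TRUE))
)
dat$countVar <- rpois(n, exp(0.5 + 0.3 * dat$income + 0.2 * dat$dummy))

# Interaction of interest, adjusted for gender and marital status
# as ordinary population-level covariates.
fit <- glm(countVar ~ income * dummy + respondentGender + maritalStatus,
           family = poisson, data = dat)
print(summary(fit)$coefficients)

# Corresponding brms formula (not run here):
# countVar ~ income * dummy + respondentGender + maritalStatus + (1 + income | region)
```

If the income:dummy coefficient changes noticeably once the covariates are in the model, that would be a hint that the unadjusted interaction was partly driven by them.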


Ok. I’m probably overthinking this.

Thanks for such a quick response - once again!


I would not consider marital status and gender as random effects. A nice distinction between fixed and random effects is given by Bates (2010), first paragraph on page 2:

“Parameters associated with the particular levels of a covariate are sometimes called the “effects” of the levels. If the set of possible levels of the covariate is fixed and reproducible we model the covariate using fixed-effects parameters. If the levels that we observed represent a random sample from the set of all possible levels we incorporate random effects in the model.”

Gender is fixed and reproducible, as is marital status. “Countries”, for example, would not be.

If you really think that the interaction is in turn moderated by another variable, you could add a three-way interaction (income * dummy * gender). However, I would not model an interaction among four variables, as the results are very hard to interpret (three-way interactions can still be plotted / visualized).
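
To give a sense of how quickly the number of terms grows, here is a small base R check of what the `*` operator expands to. The variable names follow the thread, and no data or model fit is needed for this.

```r
# Terms implied by a three-way interaction.
three_way <- attr(terms(~ income * dummy * respondentGender), "term.labels")
print(three_way)

# Adding a fourth variable roughly doubles the number of terms.
four_way <- attr(terms(~ income * dummy * respondentGender * maritalStatus),
                 "term.labels")
print(length(three_way))
print(length(four_way))
```

Note that this counts formula terms only; a factor with more than two levels, such as marital status, contributes several coefficients per term.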


That’s a good reference to have - thanks for sharing.