We study 15 non-control levels of treatment_a and 10 non-control levels of treatment_b. Not all 150 possible pairs of treatment levels have been assigned, only a subset of 48.

In general, we fit “random effects” models with control as the reference category (see this related thread). In the interest of comparing to a “fixed effects” model, we fit an lm(), but cannot reproduce it in brm(). I suspect this is because lm() drops all but the 48 interactions whose pairs appear in the data, whereas brm() appears to include all 150?

Thank you!

fit_lm = lm(y ~ treatment_a + treatment_b + treatment_a:treatment_b, data = dat)
coefs_lm = summary(fit_lm)$coefficients
interactions_lm = grep(":", rownames(coefs_lm), value = TRUE)
length(interactions_lm) # 48 pairs of treatment actually seen in the data
# fits fine:
fit_brms_additive <- brm(y ~ treatment_a + treatment_b, data = dat)
# does not converge (Rhat > 3 even with 10000 iterations):
fit_brms <- brm(y ~ treatment_a + treatment_b + treatment_a:treatment_b, data = dat)
coefs_brms = summary(fit_brms)$fixed
interactions_brms = grep(":", rownames(coefs_brms), value = TRUE)
length(interactions_brms) # 150, all possible pairs including those not in the data

I would start by specifying a reasonable regularizing prior on the coefficients. I think the default coefficient prior is flat, which means that levels that don’t receive likelihood contributions are completely free to wander around aimlessly and give the sampler problems. A weakly regularizing prior would fix that.
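For example, something like this (a sketch; normal(0, 1) is just a generic weakly regularizing choice, so adjust the scale to your outcome):

```r
library(brms)

# Sketch: weakly regularizing prior on all population-level coefficients.
# Interaction levels with no likelihood contribution get shrunk toward 0
# instead of wandering under a flat prior.
fit_brms_reg <- brm(
  y ~ treatment_a + treatment_b + treatment_a:treatment_b,
  data  = dat,
  prior = prior(normal(0, 1), class = "b")
)
```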

Very helpful! Now brm() converges with all those interactions. :)

Next I will try to understand why brm(), even with an extremely weak prior, gives 5% “statistically significant” interactions (the Type I error rate), while lm() gives 18% “statistically significant” interactions!

EDIT: this is a denominator issue! brm() and lm() almost exactly agree on the number of “statistically significant” interactions, but I was taking the denominator for brm() to include interactions that never even enter the likelihood.
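To make the comparison concrete, something like the following (a sketch; it assumes the interaction row names line up between summary(fit_lm)$coefficients and summary(fit_brms)$fixed) restricts both counts to the 48 pairs actually seen in the data:

```r
# Count "significant" interactions with a common denominator:
# only the pairs that appear in both fits.
seen <- intersect(interactions_brms, interactions_lm)

sig_lm <- sum(coefs_lm[seen, "Pr(>|t|)"] < 0.05)

ci <- coefs_brms[seen, c("l-95% CI", "u-95% CI")]
sig_brms <- sum(ci[, 1] > 0 | ci[, 2] < 0)  # 95% CI excludes 0

sig_lm / length(seen)    # lm proportion
sig_brms / length(seen)  # brms proportion, same denominator
```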

Do you mind sharing a little more information about your study and the outcome variable here?

Are the point estimates for the coefficients all between -0.05 and 0.07? If so, it seems like the effect sizes are just very small. Depending on your sample size, lm() could just be finding some of these effects as significant when they are not. I’d guess that the p-values for those “significant” fixed effects would be just barely smaller than 0.05 – these near-0.05 results are actually more likely to occur under the null than under the alternative hypothesis for adequately powered analyses.

Assuming that you’re using the 95% credible interval as the determination of a “significant” result in the brms model, the fact that about 5% of the resulting intervals do not include 0 would suggest that these are indeed spuriously “significant” results consistent with the nature of the 95% interval.

How are these treatment variables coded and what contrast coding are you using? I assume that they are unordered factors – dat$treatment_a <- factor(...). By default, dummy variable coding is used so that the intercept is the expectation of dat$y when both treatment_a == 0 & treatment_b == 0 (i.e., when the factors are at their reference level). If all of this is the case, then it would be really helpful to know more about your outcome variable y because coefficients ranging between -0.05 and 0.07 would suggest that the difference in the expected y value for any of these treatment groups is less than a tenth of a unit. That difference may be meaningful or it may not – it depends on the scale of the outcome.
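As a sketch of what I mean (this assumes your factors have a level literally named "control"; substitute your own label):

```r
# Dummy (treatment) coding with "control" as the explicit reference level
dat$treatment_a <- relevel(factor(dat$treatment_a), ref = "control")
dat$treatment_b <- relevel(factor(dat$treatment_b), ref = "control")

# contr.treatment: the reference level is the all-zero row,
# so the intercept is the expected y at control/control
contrasts(dat$treatment_a)
```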

Since this seems to be a standard linear regression, you can think of those coefficients as the differences in the conditional means of the treatment groups. When both treatment_a and treatment_b are at their reference levels, this is the intercept estimate and can be interpreted as the average y value for the group defined by these reference levels. If b_treatment_a1 is 0.02, then this means that being in the second level of treatment_a is associated with having a mean y value that is 0.02 units larger than the reference (first) level of treatment_a. Again, depending on the scale of your y variable, that may or may not be a meaningful mean difference.
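As a toy illustration of that arithmetic (hypothetical numbers, not from your fit):

```r
b0   <- 1.50  # hypothetical intercept: mean y at both reference levels
b_a1 <- 0.02  # hypothetical b_treatment_a1

# Implied conditional mean for that level of treatment_a,
# with treatment_b still at its reference level:
b0 + b_a1  # 1.52
```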

As you have already found out, regularizing priors make a huge difference in this case. There are, however, some other considerations to keep in mind about whether or not the model is appropriate for your question. Have you run any posterior predictive checks or assumption checks to ensure that there is not an issue with the linear regression model itself? In particular, I’d worry about collinearity with the interaction since the additive model converges with flat priors but needed regularization to converge when the interaction was added. It’s also possible that the sample size is too small for the number of effects being estimated, though. Regarding the possibility of collinearity, a better direct comparison to lm() may be to use the QR decomposition in brms since my understanding is that this is done automatically in lm(). You could try something like the following:
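For instance (a sketch reconstructing such a call; normal(0, 10) is only a placeholder prior, and decomp = "QR" is the bf() argument that requests the QR parameterization):

```r
# Interaction model with QR decomposition of the design matrix,
# explicit (non-centered) intercept, and a placeholder prior
fit_brms_qr <- brm(
  bf(y ~ 0 + Intercept + treatment_a * treatment_b, decomp = "QR"),
  data   = dat,
  family = gaussian(),
  prior  = prior(normal(0, 10), class = "b")  # placeholder; scale to your y
)
```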

Note that treatment_a * treatment_b is just shorthand for treatment_a + treatment_b + treatment_a:treatment_b. The family = ... argument is not necessary here, but I like to specify those kinds of things for transparency. Also, if the 0 + Intercept (or -1 + Intercept) notation is not used, then there is a separate prior over the intercept; I believe the default prior on the intercept is student_t(3, 0, 2.5), though I may be wrong about that. Additionally, without the above specification, the intercept estimated by brms is based on a centered design matrix. Depending again on the scale of your y variable, it is possible that neither the default student_t(3, 0, 2.5) nor a normal(0, 10) prior is appropriate for the intercept (you’re basically saying that you think the most probable mean value for the reference levels of the two treatments is 0 or something close-ish to 0).
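Rather than guessing at the defaults, you can print the full default prior table for a given formula with get_prior():

```r
# Inspect brms's default priors for this model specification
# before overriding anything
get_prior(y ~ 0 + Intercept + treatment_a * treatment_b, data = dat)
```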