Testing an "omnibus" effect of a variable across the output categories

I’m new to this forum, so I hope I don’t come across as an idiot. :) This seems like such a basic question, but I’m just not confident since I’m new to multinomial models.

It’s a simple situation: one two-level predictor used to predict which of four categories will be chosen (using brms with family = categorical). I know how to test a specific response category using emmeans and the dpar argument, but is there a way to determine whether the predictor had an effect on the distribution of response probabilities overall (a kind of omnibus test) before diving into specific comparisons for each response category?
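For reference, here’s a minimal sketch of the kind of model and per-category test I mean (the data frame and variable names here are made up):

library(brms)
library(emmeans)

fit <- brm(Response ~ Injury + (1 | Subject),
           family = categorical(), data = d)

# Per-category comparison, e.g. on the linear predictor of category 2:
pairs(emmeans(fit, ~ Injury, dpar = "mu2"))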

Here’s some example output

Group-Level Effects: 
~Subject (Number of levels: 40) 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(mu2_Intercept)     1.23      0.15     0.97     1.56 1.00     1524     1944
sd(mu3_Intercept)     0.79      0.10     0.63     1.01 1.00     1728     2334
sd(mu4_Intercept)     1.16      0.14     0.92     1.46 1.00     1521     1893

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu2_Intercept     1.23      0.20     0.86     1.63 1.00     1531     1903
mu3_Intercept    -0.25      0.13    -0.50     0.01 1.01     1816     1980
mu4_Intercept    -0.06      0.19    -0.45     0.31 1.00     1721     1832
mu2_Injury1       1.29      0.19     0.91     1.67 1.00     1494     1795
mu3_Injury1       0.21      0.13    -0.04     0.46 1.00     2073     1687
mu4_Injury1       0.34      0.19    -0.02     0.70 1.00     1610     1979

I thought the solution might be in using the hypothesis() function:
hypothesis(modelname, "mu2_Injury1 + mu3_Injury1 + mu4_Injury1 = 0")

But, I’m not confident that this is doing what I want. Any thoughts?

I have considered the option of just running a model without my predictor and comparing the two models, but I have already analyzed 16,000 simulated data sets using brms and am not looking forward to running 16,000 more!
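(For what it’s worth, the comparison I have in mind would look something like this, assuming both fits were available:)

library(brms)
# Hypothetical fits with and without the predictor:
# fit_full <- brm(Response ~ Injury + (1 | Subject), family = categorical(), data = d)
# fit_null <- brm(Response ~ 1 + (1 | Subject), family = categorical(), data = d)
loo_compare(loo(fit_full), loo(fit_null))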

I’m running this on a Mac with brms version 2.15.0

I think mu2_Injury1 + mu3_Injury1 + mu4_Injury1 = 0 “tests” the null hypothesis that the sum of these 3 parameters is zero. This is not the same as the joint null hypothesis mu2_Injury1 = 0 and mu3_Injury1 = 0 and mu4_Injury1 = 0, which I guess is what you mean by an omnibus test. (For example, the values 1, 1, and -2 satisfy the sum constraint even though each individual effect is non-zero.) In principle, you could inspect the marginal posteriors of these 3 parameters (summarized in your output above under Population-Level Effects), which basically means to use

hypothesis(modelname, c(
  "mu2_Injury1 = 0",
  "mu3_Injury1 = 0",
  "mu4_Injury1 = 0"
))

but that doesn’t include a multiplicity adjustment, so it doesn’t take into account that you’re running 3 separate “tests” and want to control the “family-wise error rate” of having at least one false positive “test” result among them. As far as I know, brms doesn’t offer a multiplicity adjustment. I’m not sure, but I think a very conservative multiplicity adjustment could be achieved by constructing simultaneous 95% posterior intervals based on a 95% highest posterior density (HPD) region. Basically, the idea is as follows (here illustrated with the “Eight schools” example from here, so the content of schools.stan has to be copied over from there):

library(rstan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

schools_dat <- list(J = 8,
                    y = c(28,  8, -3,  7, -1,  1, 18, 12),
                    sigma = c(15, 10, 16, 11,  9, 11, 10, 18))

fit <- stan(file = "schools.stan", data = schools_dat, seed = 2098606L)
lvl <- 0.95
mat <- as.matrix(fit)
# Draws whose log posterior density (lp__) lies above this quantile
# approximate a 95% HPD region:
lp_q <- quantile(mat[, "lp__"], probs = 1 - lvl)
# Flag the draws falling inside the approximate HPD region:
mat <- cbind(mat, "in_itvl" = mat[, "lp__"] >= lp_q)
# Names of the parameters to adjust:
pars_adj <- paste0("eta[", seq_len(4), "]")
# Simultaneous ("adjusted") 95% posterior intervals: the range of each
# parameter across the draws inside the HPD region:
for (pars_adj_i in pars_adj) {
  message(pars_adj_i)
  print(range(mat[as.logical(mat[, "in_itvl"]), pars_adj_i]))
}

Note that these simultaneous 95% posterior intervals based on the lp__ values are very conservative, as they take all parameters into account and as they “frame” the HPD region. But they might be helpful for a sensitivity analysis nonetheless. For focusing on a subset of parameters (i.e., a multivariate marginal posterior), a generic multivariate kernel density estimation (as implemented, e.g., in Compositional::mkde()) would have to be applied to the draws of this subset of parameters, but that might introduce unstable estimates, especially in the tails, which are precisely where your interest lies (see this post for the univariate case). Furthermore, HPD intervals (in general) are not invariant with respect to parameter transformations (see this post and this thread for the univariate case). Finally, for the intervals I calculated above, a multimodal posterior would add even more conservativeness, as the resulting intervals would “bridge” the gap between the modes.
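Just to make that subset idea a bit more concrete, here is a rough sketch of such a density-based check on the three Injury coefficients. I’m using ks::kde() instead of Compositional::mkde() because it allows evaluating the density estimate at arbitrary points; the model object name and the coefficient names (with brms’s usual “b_” prefix) are assumptions, and all the caveats above about instability in the tails apply:

library(ks)

# Draws of the three population-level Injury effects (column names
# assume brms's usual "b_" prefix; adjust as needed):
draws <- as.matrix(modelname)[, c("b_mu2_Injury1",
                                  "b_mu3_Injury1",
                                  "b_mu4_Injury1")]

# Kernel density estimate of the trivariate marginal posterior,
# evaluated at the draws themselves and at the origin:
dens_draws  <- kde(x = draws, eval.points = draws)$estimate
dens_origin <- kde(x = draws, eval.points = rbind(c(0, 0, 0)))$estimate

# The origin falls outside an approximate 95% HPD region of this
# marginal posterior if its density is below the 5% density quantile:
dens_origin < quantile(dens_draws, probs = 0.05)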

Some acknowledgments, even though I don’t know whether my suggestion above is helpful: I got the idea from an answer on this site.

Thank you for that detailed reply! I’ll have to take some time this weekend to unpack it, but you are absolutely correct that it’s an issue of conjunction (a and b and c) and not simple addition.

I recommend that you forget about the hypothesis testing and multiplicity adjustment. Instead, you can fit a multilevel model; see here: http://www.stat.columbia.edu/~gelman/research/published/multiple2f.pdf


Thanks for your thoughts, @andrewgelman! I already knew about your recommendation to fit a multilevel model instead of performing a multiplicity adjustment, but in the present case, I thought it would be hard to realize because @MikeKSU would have to extract the Stan code from brms, adapt it, and then fit the model with rstan or cmdstanr. But thinking it over again, this is probably not much harder than my post-hoc solution above.

We had already fit a multilevel version in which response class (“Key” in our case) was treated as both a fixed effect and a random effect (like the approach proposed in the @andrewgelman paper). A portion of our project involves comparing different ways of analyzing these data, including the Bayesian multilevel multinomial approach outlined above. Because the simulations involved comparing a number of frequentist approaches against one Bayesian approach, I was being forced into making claims about the effect of the intervention across all four classes (the “main effect” of key choice), which emerges naturally when response class is treated as a predictor with a random effect, but not when response class is used as the outcome in a multinomial model. A rough sketch of that formulation follows.
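For concreteness, here is one way to read the multilevel version just described (all names are made up; the data are reshaped to long format so each trial contributes one row per response class, with an indicator for whether that class was chosen):

library(brms)

# long_d columns (assumed): chosen (0/1 indicator for whether this
# class was the response), Key (4-level response class), Injury
# (two-level predictor), Subject (participant ID)
fit_ml <- brm(chosen ~ Injury + Key + (0 + Injury | Key) + (1 | Subject),
              family = bernoulli(), data = long_d)

Here the Injury effect varies by Key and is partially pooled, so the population-level Injury coefficient gives the “main effect” across classes directly. (Note that this indicator-based Bernoulli formulation treats the four class indicators as independent, so it only approximates the multinomial likelihood.)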