Hypothesis function in logistic regression with brms

Hi everyone

I’m struggling to understand how I should use the hypothesis function when running logistic regression with brms. Let’s say I would like to find out whether the incidence of muscle cramps (Yes vs. No) is related to muscle activity in participants’ legs (in volts):

Musclecramps <- brm(Cramps ~ LegMuscleActivity + (1 | ID), data = df, family = bernoulli(link = "logit"), prior = set_prior("normal(0, 5)"))

How can I define the hypothesis to find out whether the muscle activity is related to the incidence of muscle cramps?

Is this correct?
h <- hypothesis(
  Musclecramps,
  "LegMuscleActivity > Intercept"
)

Thanks, I appreciate your help! Statistics are not my strong suit and I’m getting really confused trying out brms.

I think that this is wrong, but I’m really struggling with how to define it.

Based on what you posted, my takeaway is that you want to know whether more leg muscle activity corresponds to a greater probability of having a cramp. If that is correct, then you don’t need to test whether the predictor (LegMuscleActivity) is greater than the Intercept.

To break down the model a little more, you’re getting a fixed and a random intercept for the incidence of a cramp. I might think of these as a general baseline expectation for having a cramp; on top of this baseline, we collect some extra data (LegMuscleActivity) that we think can help predict a cramp. To test whether this extra information helps us predict the outcome, we might test whether the coefficient for this variable is different from zero (this is your traditional null hypothesis testing framework). In your case, this sounds like a directional hypothesis: we expect that higher leg muscle activity conveys a greater probability of having a leg cramp. So, to test this in brms’ hypothesis() function, you’d want something like this:

h <- hypothesis(Musclecramps,
                "LegMuscleActivity > 0"
               )

The function is testing whether the coefficient (b_LegMuscleActivity) is greater than 0 (i.e., that its effect is credibly positive rather than zero or negative). Comparing to the intercept doesn’t really make much sense and could be misleading depending on the scale of your predictor. For example, if I were predicting an outcome on a 0-50 point scale but had a predictor on a 100,000-1,000,000 point scale, then the coefficient for this predictor would be very small (we’re having to convert a large value into a smaller predicted one). As a result of this scaling issue, the intercept may have a larger magnitude (i.e., be a larger number) than the predictor’s coefficient, but this doesn’t mean that the predictor’s effect is smaller or less “important” than the intercept’s. That’s less of an issue when everything is standardized going into the regression, but even if scaling isn’t a concern, comparing a coefficient to an intercept needs an appropriate hypothesis justifying the comparison.
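For completeness, here’s a minimal sketch of how you might inspect that test once the model has run (using the Musclecramps fit from above):

# Print the test summary: Post.Prob is the posterior probability that
# b_LegMuscleActivity > 0, and Evid.Ratio is the corresponding evidence ratio
print(h)

# Visualize the posterior distribution of the tested quantity
plot(h)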

I’ll also just throw out that there are other ways of looking at your effect aside from the hypothesis() function. Within brms, you could examine the results of conditional_effects(Musclecramps) to get a really nice visualization of how differences in LegMuscleActivity relate to differences in the probability of a cramp. You could also take a fairly simple approach and examine the credible interval for the coefficient of interest: if zero falls within this credible interval, then you might conclude that the effect is “non-significant,” since some credible probability is assigned to the effect being zero or very close to it. I don’t necessarily recommend that approach, but it’s an alternative worth mentioning.
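As a quick sketch of those two options (again assuming the fitted Musclecramps object from above):

# Plot how the predicted probability of a cramp changes across
# the observed range of LegMuscleActivity
conditional_effects(Musclecramps)

# Summarize the fixed effects with 95% credible intervals; check whether
# the interval for LegMuscleActivity excludes zero
fixef(Musclecramps, probs = c(0.025, 0.975))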

You could also look into some of the convenience functions from the bayestestR package. In particular, it can be helpful to visualize the probability of direction (pd()) and the region of practical equivalence (rope()). The probability of direction summarizes the proportion of an effect’s posterior distribution that has the same sign as the distribution’s median, so it corresponds fairly closely to the traditional p-value. A large probability of direction can provide support for the presence of an effect (i.e., the majority of the posterior distribution has the same sign and is different from 0). The next question is then whether the effect is meaningful, and that’s where the region of practical equivalence (ROPE) comes in. The ROPE examines the proportion of the posterior distribution of an effect estimate that falls within 0 +/- some margin reflecting what you believe to be a “negligible” effect.
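Here’s a minimal sketch of what that could look like (assuming bayestestR is installed and the Musclecramps model from above; the ROPE range here is just a placeholder you’d replace with a justified value on the log-odds scale):

library(bayestestR)

# Probability of direction: proportion of each coefficient's posterior
# sharing the sign of its median
p_direction(Musclecramps)

# ROPE: proportion of the posterior inside a "negligible" region around 0;
# bayestestR picks a default range, but you can supply your own
rope(Musclecramps, range = c(-0.1, 0.1))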