Quadratic relationship between ordinal predictor and categorical outcome

@paul.buerkner

I’m working on my master’s thesis using the brms package, and I have a question about modeling in this context.

My predictor variable is ordinal and measures workload on a 5-point scale. My outcome variable is an ordered categorical variable with 3 categories. The relationship between workload and the outcome appears to be exponential, with a peak at the midpoint (a workload score of 3).

I want to model this relationship but am concerned about how to handle the workload variable in the model. If I treat the workload variable as an ordered factor, the model estimates polynomials of higher orders than 2, which might not be ideal. On the other hand, I prefer not to convert the workload variable into a numeric scale.

Is there a way to model this using brms that respects the ordinal nature of the outcome variable and accurately captures the non-linear relationship with the workload predictor, without resorting to high-order polynomials?

With this model I get higher-order polynomial terms in the output:
test_workload_f <- brm(
  formula = category ~ factor(workload) + (1 + factor(workload) || individual + time),
  data = sim_data,
  family = cumulative("logit"),
  iter = 2000,
  chains = 2,
  warmup = 500,
  cores = parallel::detectCores()
)

I also tried making the workload score numeric (but the outcome also doesn't look the way I want):
# as.numeric() on a factor returns the integer level codes (1 to 5 if the levels are in that order)
sim_data$workload <- as.numeric(sim_data$workload)
formula <- bf(category ~ workload + I(workload^2) + (1 + workload + I(workload^2) || individual + time))

priors <- c(
  prior(normal(-1, 1), class = "b", coef = "IworkloadE2"), # this should be negative
  prior(normal(3, 1), class = "b", coef = "workload"), # The highest point is around workload of 3
  prior(normal(1, 1), class = "Intercept") # This should be positive
)

test_workload_quadratic <- brm(
  formula,
  data = sim_data,
  family = cumulative(),
  prior = priors,
  iter = 2000,
  chains = 2,
  warmup = 500,
  cores = parallel::detectCores()
)

Thanks a lot!
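
One detail about the priors above may be worth checking. In a quadratic linear predictor of the form b1 * workload + b2 * workload^2, the turning point sits at -b1 / (2 * b2), so the linear coefficient is not itself the location of the peak. A minimal sketch in base R (the helper `peak_location` is a made-up name for illustration, not a brms function):

```r
# Turning point of b1 * x + b2 * x^2 (a maximum when b2 < 0)
peak_location <- function(b1, b2) {
  -b1 / (2 * b2)
}

# Prior means from the post: b1 = 3, b2 = -1
peak_at_prior_means <- peak_location(b1 = 3, b2 = -1)
print(peak_at_prior_means)  # 1.5

# A peak at workload = 3 with b2 = -1 would need b1 = 6
peak_with_b1_6 <- peak_location(b1 = 6, b2 = -1)
print(peak_with_b1_6)  # 3
```

Alternatively, centering workload at 3 before squaring moves the expected peak to 0 on the centered scale, in which case the linear coefficient would be expected near zero.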

You might be looking for something like monotonic effects for the predictor variable (the mo() term in brms): Estimating Monotonic Effects with brms

It allows for predictors to have increasing impact at different points along their scale. I would consider whether incorporating such an effect gets you what you are after. I also don’t think it would be terrible to treat the predictor as categorical, if you had to - certainly seems better than numeric.
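
To make the coding difference concrete, here is a small base-R sketch (the toy `workload` vector is made up for illustration and is not taken from `sim_data`). R gives ordered factors orthogonal polynomial contrasts by default, which is where the linear, quadratic, cubic and fourth-order terms come from. An unordered factor gets one dummy-coded coefficient per non-reference level instead.

```r
# Toy 5-point predictor (hypothetical values, for illustration only)
workload <- c(1, 2, 3, 4, 5, 3, 2, 4)

# Ordered factor: default contrasts are orthogonal polynomials (contr.poly)
w_ord <- factor(workload, levels = 1:5, ordered = TRUE)
ord_cols <- colnames(model.matrix(~ w_ord))
print(ord_cols)

# Unordered factor: default treatment (dummy) contrasts, level 1 as reference
w_cat <- factor(workload, levels = 1:5, ordered = FALSE)
cat_cols <- colnames(model.matrix(~ w_cat))
print(cat_cols)
```

Note that calling `factor()` on a variable that is already an ordered factor keeps the ordering, so `ordered = FALSE` is needed to switch to dummy coding.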

I agree with @JimBob. Monotonic effects might be what you’re looking for. In addition to the vignette he linked (which is a great reference), I’ve also worked through an example of an ordinal model with an ordinal predictor here.

I may be misreading this but I don’t think what is being described here is a monotonic relationship. It feels like “exponential” is a misnomer for a hump-shaped relationship. Outcome is highest at mid workload, decreases with higher workload. @Charlotte003, is that right?

Yes, that is right. That is also the reason why I cannot use a monotonic model. The relationship is an inverted U shape, with the highest point at 3. So it increases from 1 to 3 and then decreases again.

I’ve been wondering about this. There are cases where there is an ordinal predictor and we don’t want to assume that levels are equidistant but at the same time, domain knowledge cannot preclude a u-shaped relationship. What are our options?

In my field (ecology), “too much of a good thing” is so common that even if one could show that a relationship could only be saturating but not reversing, the use of a monotonic effect would raise eyebrows.

@Charlotte003 if the relationship is as you describe, then the estimates for the higher-order polynomial terms are close to zero with wide intervals, right? I don’t think that they can have much of an influence on model predictions (check conditional_effects(test_workload_f)). You could make their priors very narrow with zero mean. But is that necessary?
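
The first point can be illustrated without fitting anything. As a sketch, assume the latent mean follows an exactly quadratic hump peaking at 3 (the values in `eta` are invented for illustration). Projecting it onto the orthogonal polynomial contrasts that R uses for a 5-level ordered factor leaves only the quadratic coefficient non-zero:

```r
# Hypothetical latent means for workload levels 1..5: a symmetric hump at 3
x <- 1:5
eta <- -(x - 3)^2

# Orthonormal polynomial contrasts for a 5-level ordered factor
C <- contr.poly(5)

# Because the columns of C are orthonormal, the coefficients are t(C) %*% eta
coefs <- drop(crossprod(C, eta))
print(round(coefs, 3))
```

With real data the cubic and fourth-order estimates will not be exactly zero, but they should be small if the hump is roughly symmetric. The coefficient names needed for narrow zero-centered priors can be listed with get_prior() before fitting.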