Estimating two sets of thresholds in an ordinal model

Imagine I fit an ordinal model to the responses of one group of subjects (Group A), given on a 7-point scale, in which their responses are predicted by a continuous Predictor:

m.groupA <- brm(Response ~ Predictor, data = ratings.groupA,
                family = cumulative(link = "probit"))

In this model, the positions of 6 thresholds would be estimated (Intercept[1], Intercept[2], etc.), plus a slope for Predictor.

A similar model is then fitted to the responses of another group of subjects (Group B), and a different set of (6+1) estimates is obtained.

Imagine further that both the slope for Predictor and the distances between threshold positions look quite different in the two groups.

I would then like to fit the following model to compare Predictor in the two Groups:

m.interaction <- brm(Response ~ Predictor * Group, data = ratings.combined,
                     family = cumulative(link = "probit"))

However, the distances between threshold positions are now estimated for the combined dataset (so far as I understand it), and I can no longer capture that the two sets of thresholds are quite differently placed in the two groups.

What would be the best way to account for such differences?

(I tried the model Response ~ 0 + Group, hoping to get 12 intercepts, i.e. 6 thresholds times 2 groups, but this produces an error.)


Just to clarify, you're trying to model a situation where the cutpoints differ between the two groups and the effect of Predictor varies between the groups?

Paul Bürkner's paper on ordinal models briefly mentions this on page 20, including how to do it in brms; it can be found here: https://psyarxiv.com/x8swp/. This model will be pretty hard to identify without strong priors, though. I was having trouble getting the model to fit some generated data with vague priors.

Your formula in brms would look more like this, if I understood you correctly. The family would have to be changed to one of the other ordinal families, like acat, cratio, or sratio. The vignette at vignette("brms_overview") also has an example modeling category-specific effects, which you should check out.

bf(y ~ (cs(1) + cs(x) || group))
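In full, the call might look something like this (an untested sketch on my part, assuming the generated data below; as far as I can tell, cs() terms require one of the non-cumulative families such as acat):

library(brms)

# sketch: category-specific intercepts and slopes varying by group under an
# adjacent-category model; expect identification problems without informative priors
fit_cs <- brm(bf(y ~ (cs(1) + cs(x) || group)),
              data = data, family = acat("logit"))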

I've never tried to fit a model like this, so I don't think I can help you much more than that. Here's the generated data I used:

N <- 400
group <- rbinom(N, 1, 0.5)            # random group assignment
N1 <- sum(group == 1)
u1 <- rlogis(N1)                      # logistic errors (logit link)
u2 <- rlogis(N - N1)
x1 <- rnorm(N1, 0, 1)
x2 <- rnorm(N - N1, 0, 1)
ys1 <- x1 + u1                        # latent variable, slope 1 in group 1
ys2 <- 1.5 * x2 + u2                  # latent variable, slope 1.5 in group 0
mu1 <- c(-Inf, -2, 0, 2, Inf)         # cutpoints for group 1
mu2 <- c(-Inf, -1.5, 0, 0.5, Inf)     # cutpoints for group 0

y1 <- cut(ys1, mu1, labels = c(1, 2, 3, 4), ordered_result = TRUE)
y2 <- cut(ys2, mu2, labels = c(1, 2, 3, 4), ordered_result = TRUE)

# note: the length-3 `cuts` column is recycled against the length-N1 columns,
# so each observation ends up repeated three times in the data frame
df1 <- data.frame(y = y1, x = x1, ys = ys1, group = rep(1, N1),
                  cuts = rep(c(-2, 0, 2), N1))
df2 <- data.frame(y = y2, x = x2, ys = ys2, group = rep(0, N - N1),
                  cuts = rep(c(-1.5, 0, 0.5), N - N1))

data <- rbind(df1, df2)
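A quick sanity check on the result (the recycled cuts column triples the row count, which is why nobs comes out as 1200 rather than 400 further down):

nrow(data)                 # 1200, each observation appears three times
table(data$y, data$group)  # response distribution by group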


library(ggplot2)

# plot the latent variable against x, with group-specific cutpoints as dashed lines
ggplot(data, aes(x = x, y = ys)) +
  geom_hline(aes(yintercept = cuts),
             color = "blue", linetype = "dashed") +
  geom_point(aes(fill = y), size = 3, pch = 21) +
  scale_fill_manual(values = c("red", "blue", "green", "orange")) +
  facet_grid(~ group)

That is correct. The idea is that the relative distances between thresholds may differ between groups, say, with one group being more reluctant to give extreme answers. The effect of the predictor can plausibly vary between groups, but I would like to estimate it (and the Predictor * Group interaction) while "adjusting" for such biases ... if that makes sense.

Fascinating, thanks a lot! (I seem to have missed that part in B&V's paper.)

The problem I see with this solution is that Group would need to be a random effect. Two possible consequences are: (a) by-group intercepts would be estimated with partial pooling (which may be good); and (b) it may be hard (or senseless) to estimate a random-effects parameter when Group has only two levels (which is bad).

I guess ideally I'd like something like what cs(1) is doing there, but for a fixed effect of Group and in a cumulative model.

A little more exploration tells me that what I want may be the estimation of nominal effects, in which the thresholds are allowed to depend on regression variables.

Using your code from above to generate data, @Reece_W, I fitted the following model with ordinal::clm():

library(ordinal)
clm(y ~ 1 + x * group, nominal = ~ group, data = data)

This shows:

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  1200 -1292.99 2601.97 6(1)  5.06e-14 3.8e+01

Coefficients: (1 not defined because of singularities)
        Estimate Std. Error z value Pr(>|z|)    
x         1.5558     0.1054  14.765  < 2e-16 ***
group         NA         NA      NA       NA    
x:group  -0.3754     0.1382  -2.715  0.00662 ** 
---
Signif. codes:  0 ā€˜***ā€™ 0.001 ā€˜**ā€™ 0.01 ā€˜*ā€™ 0.05 ā€˜.ā€™ 0.1 ā€˜ ā€™ 1

Threshold coefficients:
                Estimate Std. Error z value
1|2.(Intercept) -1.73907    0.11939 -14.566
2|3.(Intercept) -0.08490    0.09408  -0.903
3|4.(Intercept)  0.44500    0.09547   4.661
1|2.group       -0.40481    0.17998  -2.249
2|3.group       -0.10088    0.13494  -0.748
3|4.group        1.25495    0.15407   8.145

This works great. The last three estimates show the differences between threshold positions for group 1 vs. group 0, and the x:group interaction shows the difference between the two slopes of x. (The main effect of group cannot be estimated, which I suppose makes sense given that the thresholds are allowed to differ between the two groups.)

Can such ā€œnominal effectsā€ be estimated with brms?

Yeah, after reading what you said it makes more sense. I didn't think the random effects would necessarily make sense either, but that's how it was referenced in some of the brms documentation.

I wrote the model out on paper and it's exactly how it's listed in the documentation of the ordinal package.

edit: I figured it out... it's simple to do and in the documentation, just in a different spot.

brm(y | thres(gr = group) ~ x + x:group,
    data = data,
    family = cumulative())

This recovers the parameters from the generated data with the default priors.

Population-Level Effects: 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[0,1]    -1.42      0.07    -1.56    -1.29 1.00     1508     1264
Intercept[0,2]    -0.10      0.06    -0.22     0.02 1.00     2598     1617
Intercept[0,3]     3.06      0.11     2.85     3.29 1.00     1527     1419
Intercept[1,1]    -2.15      0.08    -2.31    -1.99 1.00     1244     1353
Intercept[1,2]    -1.05      0.06    -1.17    -0.92 1.00     1992     1658
Intercept[1,3]    -0.13      0.06    -0.24    -0.01 1.00     2365     1134
x                  1.42      0.06     1.30     1.55 1.00     1269     1339
x:group1          -0.44      0.09    -0.61    -0.28 1.00     1223     1353

Family Specific Parameters: 
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00 1.00     1800     1800

edit 2:

I did some more tests with different sample sizes, and the model does not recover the interaction term x:group without a large sample (n = 1000), unless you set even vaguely informative priors like a Student-t, in which case you need far fewer observations.
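Something like this, for example (a sketch on my part; the exact scale of the Student-t prior is an assumption, not something I tuned carefully):

# mildly informative Student-t priors on the regression coefficients
fit_t <- brm(y | thres(gr = group) ~ x + x:group,
             data = data, family = cumulative(),
             prior = set_prior("student_t(3, 0, 2.5)", class = "b"))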


This is perfect, seems to work wonderfully.
Many thanks!

For completeness: The only difference from the summary of the clm() model is that the Intercepts are estimated for both groups (which was my original request anyway), rather than as differences between groups. In any case, direct comparisons of the threshold locations could always be obtained by subtracting the posterior samples (or using hypothesis()).
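For instance (a sketch, assuming the fitted model above is stored in fit and that the parameter names match the summary printed there):

# posterior difference between the first thresholds of the two groups
draws <- as_draws_df(fit)
delta1 <- draws$`b_Intercept[1,1]` - draws$`b_Intercept[0,1]`
quantile(delta1, c(0.025, 0.5, 0.975))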


Can you point toward the documentation, @Reece_W?

As a follow-up on this discussion: will the brm() function successfully parse this syntax if the responses within the subgroups do not cover the full range of possible responses? For instance, imagine that there is a Likert scale with 4 possible responses, but some subgroups use only 2 or 3 of them. Maybe this is a question for @paul.buerkner.

In my case, I'm working with 5 items that were answered separately by 20+ different groups. Owing to possible variation in the translation of the items, it would be ideal if the thresholds could vary for each group-item combination. But the number of responses provided is heterogeneous across the group-item combinations.

And when I try to fit a model like this:

brm(y | thres(gr = site_item) ~ 1 + (1 | site_code),
    data = ld, family = cumulative("probit"))

then literally nothing happens. The function stops after approximately one second, no object is created, and the R console simply returns to the prompt. By contrast, the brm() function works fine when no separate thresholds are specified.

Edit: The brm() function works if I switch the thresholds from site_item to site, which substantially reduces the number of thresholds to be estimated. Perhaps that is important to consider. With this newer model, I see that the brms code successfully detects the varying number of thresholds when the full scale isn't used by a subgroup.

Given that you want to capture things like reluctance to use extreme options, a better approach than modeling differences in thresholds would be to model differences in discrimination (the variance of the latent variable). A group that is reluctant to use extreme options would have a smaller latent variance/discrimination and so be more clumped in the center options.

brm(
  bf(y ~ group + x + x:group,
     disc ~ group),
  data = data,
  family = cumulative()
)
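(Worth noting, if I read the brms documentation correctly: when disc is predicted, it is modeled on the log scale by default, so the group coefficient is a difference in log-discrimination.)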

Good feedback, @bwiernik. In principle, one could model variation in thresholds alongside differences in discrimination, no?

That would be difficult to identify, I would expect, but you could try it out.
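If you do try it, the combined model would presumably look something like this (a sketch only; as noted, it will likely need strong priors to be identified):

# group-specific thresholds plus a group effect on discrimination
fit_both <- brm(
  bf(y | thres(gr = group) ~ x + x:group,
     disc ~ group),
  data = data,
  family = cumulative()
)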
