Identification and Cumulative Probit models

mattwilliamson13 · March 13, 2023, 5:13pm

Hi All,
I’ve been following @Solomon’s post on priors for cumulative probit models and attempting to implement them in brms using the index notation (similar to @richard_mcelreath’s approach in Statistical Rethinking). Thanks again to @Solomon, there’s a post for how to do that, as well. Although the models ultimately fit (they sample successfully, with no divergences), they are extremely inefficient. That has me thinking that I need to treat at least one of my factor variables as a factor (rather than an index) to aid in the identification of the rest (there are 4 factor variables and 2 monotonic predictors in the model). I’d love anyone’s thoughts on whether the index variable approach is appropriate for ordinal models using the cumulative probit, and whether all factor variables need to be specified as factors (rather than indices) or if it is enough (at least theoretically) to specify just one. Thanks in advance and huge thanks to @Solomon for these helpful resources!

Solomon · March 13, 2023, 5:43pm

Thanks for your interest in my post. Others have DM’d me about identification issues with some of the models in that post. Namely, the last model used this formula:

bf(rating | thres(gr = item) ~ 1 + male + (1 | id) + (1 | item)) +
    lf(disc                    ~ 0 + male + (1 | id) + (1 | item),
       # don't forget this line
       cmc = FALSE)

If memory serves, it’s probably not a good idea to allow the discrimination parameter to vary by item. A better approach might be something like this:

bf(rating | thres(gr = item) ~ 1 + male + (1 | id) + (1 | item)) +
    lf(disc                    ~ 0 + male + (1 | id),
       # don't forget this line
       cmc = FALSE)

This topic has been on my to-do list for a while, and it’s going to remain there for a while yet. But I do indeed plan on walking through the issue more carefully at some point, and the post will get updated once I do.

mattwilliamson13 · March 13, 2023, 5:53pm

Thanks @Solomon - I’ve been using the following and I want to make sure that this approach for using index variables isn’t totally on the wrong track…

bf(adjrating  ~  1  + country + issue  + age + gender + ideology + ed,
                         country ~ 0 + (1|country),
                         issue ~ 0 + (1|issue),
                         age ~ 0 + age_scl,
                         gender ~ 0 + (1|gender),
                         ideology ~ 0 +  mo(ideology),
                         ed ~ 0 + mo(education),
                         nl = TRUE) +  
              lf(disc ~ 0 + (1|country),
                 cmc = FALSE)

I have something like 22,000 respondents and 12 items, so it didn’t seem feasible to let disc vary by id…

Solomon · March 13, 2023, 6:06pm

If you have 12 items and each participant rated all 12 of the items, I’d find a way to allow at least some of your parameters to vary by item. Otherwise your model is presuming the items behave identically, which seems like a strong assumption. Also assuming all participants responded to 12 items, I’d find a way to at least allow the mean structure to vary by id, but possibly the discrimination model, too.

Solomon · March 13, 2023, 6:10pm

Aside: You might want to add the #brms tag to this post so other brms users might more easily find it.

mattwilliamson13 · March 13, 2023, 6:38pm

Thanks! I’ll work on that. Would it make sense to treat the country variable as a factor rather than an index? My understanding of your notes is that by setting a variable as a factor, the result is to fix that value’s mean at 0. With the index variable approach, that’s no longer true - correct? In the absence of that, it’s not clear to me where the index variables are drawing their overall mean from. Maybe that isn’t the issue and the models are just slow because there is so much data and so many factor variables.
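
To make the identification question concrete, here is a minimal base-R sketch of the cumulative probit category probabilities, assuming the parameterisation documented for brms, P(Y <= k) = pnorm(disc * (tau_k - eta)). The thresholds, linear predictor, and function name are made-up illustrations, not values from my model. It shows the two invariances that need some constraint: shifting every threshold and the linear predictor by the same constant, and rescaling disc against the thresholds and linear predictor.

```r
# Category probabilities for a cumulative probit model, assuming
# P(Y <= k) = pnorm(disc * (tau_k - eta)).
# cum_probit_probs is a hypothetical helper, not a brms function.
cum_probit_probs <- function(tau, eta, disc = 1) {
  cum <- c(pnorm(disc * (tau - eta)), 1)  # P(Y <= k) for k = 1..K
  diff(c(0, cum))                         # P(Y = k)
}

tau <- c(-1, 0, 1.5)  # hypothetical thresholds for a 4-category rating
eta <- 0.4            # hypothetical linear predictor for one observation

p_base <- cum_probit_probs(tau, eta)

# Location: add the same constant to every threshold and to eta.
p_shifted <- cum_probit_probs(tau + 2, eta + 2)

# Scale: multiply disc by 3 while dividing tau and eta by 3.
p_scaled <- cum_probit_probs(tau / 3, eta / 3, disc = 3)

# Compare the three sets of category probabilities.
print(all.equal(p_base, p_shifted))
print(all.equal(p_base, p_scaled))
```

If the three probability vectors agree, the likelihood alone cannot separate an overall mean from the thresholds, or an overall disc from the scale of the thresholds and predictors, so something else has to pin them down, whether a reference level or the priors.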

Solomon · March 13, 2023, 6:40pm

If your participants are nested within countries, you might want to fit a 3-level model, rather than a cross-classified 2-level model.
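
A rough sketch of what that mean structure could look like, written as a plain R formula so it can be inspected without fitting anything. It assumes your data have a respondent identifier called id (a hypothetical name here) and that issue is the item variable; the predictors are borrowed from your formula above, and the discrimination model is left out.

```r
# Hypothetical 3-level mean structure: respondents (id) nested in countries,
# crossed with items (issue). Only the mean structure is sketched here.
f3 <- adjrating ~ 1 + age_scl + mo(ideology) + mo(education) +
  (1 | country/id) +  # shorthand for (1 | country) + (1 | country:id)
  (1 | issue)         # lets the mean structure vary by item

# List the variables the data frame would need to contain.
print(all.vars(f3))
```

The formula would then go to brms::brm() with family = cumulative("probit"). With roughly 22,000 respondents the id-level intercepts add a lot of parameters, so treat this as a starting point rather than a recommendation.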
