Multidimensional IRT in brms

martinkan · February 11, 2020, 2:26pm

Dear Paul,
in your recent excellent paper “Bayesian Item Response Modeling with brms and Stan”, you said that “the non-linear multilevel formula syntax of brms allows for a flexible yet concise specification of multidimensional IRT models”. Could you please give a short example of brms code how to specify a multidimensional model for polytomous data with estimated discrimination? Eg. 10 items in total, 5 of them in one dimension and another 5 in othoer dimension. Thanks a lot. Martin

Please also provide the following information in addition to your question:

Operating System: Windows
brms Version: 2.11.1

paul.buerkner · February 11, 2020, 3:11pm

Hi Martin,

welcome to the Stan forums. Under your circumstances, the 1PL model could look as follows:

bf(y ~ (1 | item) + (0 + itemtype | person),
   family = bernoulli())

where itemtype is a factor assigning the items to the two dimensions. The corresponding 2PL model, could look as follows:

bf(y ~ disc * eta,
   eta ~ (1 | item) + (0 + itemtype | person),
   disc ~ (1 | item),
   nl = TRUE,
   family = bernoulli()
)

martinkan · February 11, 2020, 4:42pm

Dear Paul,
thanks a lot for your reply. Unfortunately, my items are polytomous, as I said previously, so the family will be “cumulative” (or graded response model in IRT terminology). As you have said: “We have to use slightly different formula syntax, though, as the non-linear syntax of brms cannot handle the ordinal thresholds in the way that is required when adding discrimination parameters.” Your example: “formula_va_ord_2pl <- bf(resp ~ 1 + (1 |i| item) + (1 | id), disc ~ 1 + (1 |i| item))”. Should I add (0+itemtype) to (1|id) to obtain (0 + itemtype | id)?

paul.buerkner · February 11, 2020, 4:51pm

Yes, exactly. The key is to replace (1 | id) by (0 + itemtype | id).

jackbailey · April 4, 2020, 5:19pm

I have some experimental data where respondents saw a random political slogan and then reported how it made them feel on a 0-10 scale across 10 emotions which theory suggests load onto three latent factors.

They look as follows:

# A tibble: 94,530 x 7
      id    w8 gender slogan          emotion      resp  group          
   <dbl> <dbl> <fct>  <fct>           <fct>        <ord> <fct>          
 1 97322  1.07 Female Get Brexit done Enthusiastic 5     Positive Affect
 2 97322  1.07 Female Get Brexit done Hopeful      10    Positive Affect
 3 97322  1.07 Female Get Brexit done Proud        5     Positive Affect
 4 97322  1.07 Female Get Brexit done Scared       5     Anxiety        
 5 97322  1.07 Female Get Brexit done Worried      5     Anxiety        
 6 97322  1.07 Female Get Brexit done Afraid       6     Anxiety        
 7 97322  1.07 Female Get Brexit done Hateful      5     Aversion       
 8 97322  1.07 Female Get Brexit done Angry        5     Aversion       
 9 97322  1.07 Female Get Brexit done Bitter       5     Aversion       
10 97322  1.07 Female Get Brexit done Resentful    5     Aversion

I’d like to model how varying the slogan and respondent gender affects their latent emotions.

I fit the following model to a single slogan, which seemed to work well:

brm(
  formula =
    bf(resp | weights(w8) ~ 1 + (1 |i| emotion) + (0 + group | id),
       disc ~ 1 + (1 |i| emotion)),
  family = 
    cumulative(
      link = "probit",
      link_disc = "log"),
  prior = 
    prior(normal(0, 1), class = "Intercept") +
    prior(constant(1), class = "sd", group = "id") +
    prior(normal(0, 3), class = "sd", group = "emotion") +
    prior(normal(0, 1), class = "sd", dpar = "disc") +
    prior(lkj(2), class = "cor"),
  data = 
    dta %>%
    filter(slogan == "Get Brexit done"),
  iter = 2e3,
  chains = 2,
  cores = 2
 )

My question: How can I extend the model to handle covariates like gender that might themselves vary over the slogan that each respondent saw?

paul.buerkner · April 5, 2020, 7:39pm

You could add, for instance, slogan*gender to the predictor term. Or, if you want to use multlevel structure over slogans, add (gender | slogan) instead.

jackbailey · April 9, 2020, 10:44pm

Is there any way to have these vary over group? I.e. as if I were predicting their effect on the 3 different latent variables? (here shown as group).

paul.buerkner · April 10, 2020, 2:38am

They could interact with group of course.

jackbailey · April 10, 2020, 12:48pm

Sorry for all the questions, this is all a little mind-bending — I assume that I also wouldn’t include a main effect in that case.

For example:

bf(resp ~ 1 + (1 |i| emotion) + (0 + group*gender + group*slogan | id),
   disc ~ 1 + (1 |i| emotion))

paul.buerkner · April 10, 2020, 1:01pm

When adding varying effects you almost always want to include the corresponding overall effects.

jackbailey · April 10, 2020, 5:50pm

Ok, thanks. I’m just trying to wrap my head around why there’s no overall effect of group/itemtype in the example you gave above. Was this an oversight?

paul.buerkner · April 10, 2020, 6:10pm

Yeah I missed that sorry.

jackbailey · April 10, 2020, 6:15pm

Ok, cool! Good to know that I wasn’t going crazy!

Famondir · December 5, 2020, 11:03am

Dear multidim IRT modelers,
I have a question about two competing possibilties to spedify a 2PL MIRT model. The first one sfollows the code you showed in your paper about “Analysing Standard Progressive Matrices”, @paul.buerkner:

formula <- bf(
  response ~ beta + exp(logalpha1) * theta1,
  nl = TRUE,
  logalpha1 ~ 1 + (1 | item),
  theta1 ~ 0 + (0 + dimension | person),
  beta ~ 0 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)

I just alterered the (1 | person) to (0 + dimension | person) and dropped the item correlations. The second specification looks like this:

formula <- bf(
  response ~ beta + exp(logalpha1) * map_dim1 * theta1 +
    exp(logalpha2) * map_dim2 * theta2 +
    exp(logalpha3) * map_dim3 * theta3,
  nl = TRUE,
  theta1 ~ 0 + (1 |p| person),
  theta2 ~ 0 + (1 |p| person),
  theta3 ~ 0 + (1 |p| person),
  beta ~ dimension + (1 | item),
  logalpha1 ~ 1 + (1 | item),
  logalpha2 ~ 1 + (1 | item),
  logalpha3 ~ 1 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)

The map_dim terms refer to integer columns in the data and are 1 if the item is related to the dimension and 0 else. I came up with this approach for models where items can have multiple dimensions and where the number of dimensions a item is related to can differ. It works.

When I came back an began to make some comparisms with simulated data on a model that could be described by both models I found the following points: The second approach…

is slower. It takes almoast twice as long.
gives some wide spread estimators for the logalphas that are multiplied by 0 in the main formula (*).
gets a little lower loo elpd.
BUT better recovers the values for the logalpha SDs used to simulate the data.

The first model pools al logalphas so the SD over all logalphas corresponding to one dimension get shrunken to much and the difference in the SD between the dimensions gets shrunken as well. I think this is a drawback and I would like to hear your opion about this.

One possible solution I found after several tries to was:

formula <- bf(
  response ~ beta + exp(logalpha1) * theta1,
  nl = TRUE,
  logalpha1 ~ 1 + (1 | dimension/item),
  theta1 ~ 0 + (0 + dimension | person),
  beta ~ 0 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)

It should be noted that the estimation time for this model was the same as for the model with the mapping vectors. It models the underlying parameters better but the elpd_loo does not improve more than one se_diff compared to the model that does not model different logalpha SDs.

(*): You can pass stan code via stan vars to set the ranefs of these to 0 manually. However these parameters don’t have an effect on the modelling process at all, because they are multiplied by 0 anyway. So to set these to zero is more cosmetic:

stan_code <- 'r_5_logalpha1_1 = r_5_logalpha1_1 .* map_alpha1;
r_6_logalpha2_1 = r_6_logalpha2_1 .* map_alpha2;
r_7_logalpha3_1 = r_7_logalpha3_1 .* map_alpha3;'

Topic		Replies	Views
Higher-order IRT model in brms brms specification , irt , hierarchical-model	1	990	October 14, 2020
Brms cumulative model with varying response categories brms	2	140	June 23, 2024
Multivariate PCM Modeling	14	862	March 4, 2020
Ordinal IRT for Latent Variable in BRMS Modeling	8	1245	August 4, 2021
2PL IRT Efficiency Questions brms fitting-issues , specification , performance	11	1012	May 26, 2020

Multidimensional IRT in brms

Related topics