Multidimensional IRT in brms

Dear multidim IRT modelers,
I have a question about two competing possibilties to spedify a 2PL MIRT model. The first one sfollows the code you showed in your paper about “Analysing Standard Progressive Matrices”, @paul.buerkner:

formula <- bf(
  response ~ beta + exp(logalpha1) * theta1,
  nl = TRUE,
  logalpha1 ~ 1 + (1 | item),
  theta1 ~ 0 + (0 + dimension | person),
  beta ~ 0 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)

I just alterered the (1 | person) to (0 + dimension | person) and dropped the item correlations. The second specification looks like this:

formula <- bf(
  response ~ beta + exp(logalpha1) * map_dim1 * theta1 +
    exp(logalpha2) * map_dim2 * theta2 +
    exp(logalpha3) * map_dim3 * theta3,
  nl = TRUE,
  theta1 ~ 0 + (1 |p| person),
  theta2 ~ 0 + (1 |p| person),
  theta3 ~ 0 + (1 |p| person),
  beta ~ dimension + (1 | item),
  logalpha1 ~ 1 + (1 | item),
  logalpha2 ~ 1 + (1 | item),
  logalpha3 ~ 1 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)

The map_dim terms refer to integer columns in the data and are 1 if the item is related to the dimension and 0 else. I came up with this approach for models where items can have multiple dimensions and where the number of dimensions a item is related to can differ. It works.

When I came back an began to make some comparisms with simulated data on a model that could be described by both models I found the following points: The second approach…

  • is slower. It takes almoast twice as long.
  • gives some wide spread estimators for the logalphas that are multiplied by 0 in the main formula (*).
  • gets a little lower loo elpd.
  • BUT better recovers the values for the logalpha SDs used to simulate the data.

The first model pools al logalphas so the SD over all logalphas corresponding to one dimension get shrunken to much and the difference in the SD between the dimensions gets shrunken as well. I think this is a drawback and I would like to hear your opion about this.

One possible solution I found after several tries to was:

formula <- bf(
  response ~ beta + exp(logalpha1) * theta1,
  nl = TRUE,
  logalpha1 ~ 1 + (1 | dimension/item),
  theta1 ~ 0 + (0 + dimension | person),
  beta ~ 0 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)

It should be noted that the estimation time for this model was the same as for the model with the mapping vectors. It models the underlying parameters better but the elpd_loo does not improve more than one se_diff compared to the model that does not model different logalpha SDs.

(*): You can pass stan code via stan vars to set the ranefs of these to 0 manually. However these parameters don’t have an effect on the modelling process at all, because they are multiplied by 0 anyway. So to set these to zero is more cosmetic:

stan_code <- 'r_5_logalpha1_1 = r_5_logalpha1_1 .* map_alpha1;
r_6_logalpha2_1 = r_6_logalpha2_1 .* map_alpha2;
r_7_logalpha3_1 = r_7_logalpha3_1 .* map_alpha3;'