I am testing a series of item response models for a verbal learning test. The test consists of 10 words that are read by the participant across 3 trials; after each trial, the participant is asked to recall from memory as many words as they can. The test also presents the 10 words in a different order on each trial. I can’t share the data, but I’ve defined my variables below, followed by a toy example of the data structure.
Variables:
- Resp - response variable, scored 0 (did not recall) or 1 (recalled) for each word in each trial
- Item - identification variable, factor (levels 1-10) identifying each word
- ID - identification variable, factor (levels 1-1478) identifying each participant
- ItemPos - factor (levels 1-10) to identify what position (order) the item was presented in for that trial
- TrialID - factor (levels 1-3) to identify which trial the response belongs to
- LocDep - identification variable defining dependency between items (i.e., correct recall of a word on a later trial is coded as dependent on recall of the same word on a previous trial); the coding and its inclusion in the syntax come from the De Boeck paper on using lme4 for IRT
- ItemPosDif - a DIF variable coded to examine whether position effects differ between trials; the coding and its inclusion in the syntax come from the De Boeck paper and here
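Since I can’t share the data, here is a minimal simulated frame with the same shape, just so the syntax below is reproducible. All values are arbitrary placeholders; the real LocDep and ItemPosDif coding follows the De Boeck paper rather than the stand-ins here:

# Toy long-format data with the same dimensions (1478 x 10 x 3 = 44,340 rows)
set.seed(1)
df_long <- expand.grid(
  ID = factor(1:1478),
  Item = factor(1:10),
  TrialID = factor(1:3)
)
df_long$ItemPos <- factor(sample(1:10, nrow(df_long), replace = TRUE))  # placeholder positions
df_long$Resp <- rbinom(nrow(df_long), 1, 0.5)                           # placeholder responses
df_long$LocDep <- ifelse(df_long$TrialID == "1", 0,
                         rbinom(nrow(df_long), 1, 0.5))                 # placeholder dependence indicator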
Just as an example of the syntax, the following is my call for the “base” model being examined:
# Weakly informative priors on the person and item SDs
Priors_Rasch <-
  prior("normal(0, 3)", class = "sd", group = "ID") +
  prior("normal(0, 3)", class = "sd", group = "Item")

# Rasch model with local dependence: item easiness as a random intercept,
# person ability and the dependence effect as correlated person-level effects
Rasch_1D <- brm(Resp ~ 1 + LocDep + (1 | Item) + (1 + LocDep | ID),
  data = df_long,
  family = bernoulli(link = "logit"),
  prior = Priors_Rasch,
  iter = 3000, warmup = 1000,
  control = list(adapt_delta = 0.99))
So far, I’ve been using the priors and general model syntax from Paul Bürkner’s wonderfully helpful publication. My goal is to systematically examine several different models (e.g., Rasch vs. 2PL, unidimensional vs. multidimensional, no DIF vs. DIF, etc.). I expect Stan to take a while fitting the models since they are (a) complicated and (b) fairly large (n = 44,340 rows in long format). The Rasch model shown above takes about 2.5 hours and runs without issues; the 2PL version of the same model (sketched below) takes ~26 hours.
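For reference, the 2PL version uses the nonlinear-formula approach from Bürkner’s IRT paper, roughly like this (object names such as Formula_2PL and TwoPL_1D are just my labels, and the priors are illustrative):

# 2PL: discrimination enters as exp(logalpha) so it stays positive;
# the |i| terms let item easiness and discrimination correlate across items
Formula_2PL <- bf(
  Resp ~ exp(logalpha) * eta,
  eta ~ 1 + LocDep + (1 |i| Item) + (1 + LocDep | ID),
  logalpha ~ 1 + (1 |i| Item),
  nl = TRUE
)

Priors_2PL <-
  prior("normal(0, 3)", class = "sd", group = "ID", nlpar = "eta") +
  prior("normal(0, 3)", class = "sd", group = "Item", nlpar = "eta") +
  prior("normal(0, 1)", class = "sd", group = "Item", nlpar = "logalpha")

TwoPL_1D <- brm(Formula_2PL,
  data = df_long,
  family = bernoulli(link = "logit"),
  prior = Priors_2PL,
  iter = 3000, warmup = 1000,
  control = list(adapt_delta = 0.99))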
I wanted to see whether anyone has suggestions for cutting down the run time or otherwise improving the efficiency of these models. I have toyed with the idea of running brm_multiple() on an AWS EC2 instance with the dataset split into 4 parts and estimated simultaneously, but I didn’t see much improvement in fitting time on smaller test datasets. I am currently using WAIC to inform model selection (the 2PL and trial-specific-dimension models come out ahead of the Rasch and unidimensional models, but I haven’t been able to get a “clean” run of the multidimensional 2PL model yet since I’ve only run it with adapt_delta = 0.80).
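For the WAIC comparisons I’m doing something like the following (model objects named here, such as TwoPL_1D, are placeholders for my fitted models):

# Compare fitted models by information criterion (lower WAIC is better)
WAIC_Rasch <- waic(Rasch_1D)
WAIC_2PL <- waic(TwoPL_1D)
loo_compare(WAIC_Rasch, WAIC_2PL)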
A related question I have is whether anyone has experience using something like a horseshoe prior for IRT models. I have seen a couple of papers using a horseshoe prior for Bayesian neural network IRT models, but none in a generalized linear mixed model. All of the models will have fixed and random effects modeled for all of the covariates, so I suspect that shrinking the large number of coefficients being estimated would be the more meaningful lever for reducing estimation time.
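From what I can tell, brms exposes the horseshoe only for population-level ("b") coefficients, not for the sd class, so my understanding is that it would look roughly like the sketch below; I haven’t tested whether this actually helps with run time:

# Horseshoe prior on the fixed effects to shrink small coefficients;
# the random-effect SD priors stay the same as before
Priors_HS <-
  prior(horseshoe(1), class = "b") +
  prior("normal(0, 3)", class = "sd", group = "ID") +
  prior("normal(0, 3)", class = "sd", group = "Item")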
Session info:
- Operating System: Windows 10, 64 bit
- brms Version: 2.12.0 (running Stan version 2.19.3 and R version 4.0.0)