I am trying to fit an IRT model in brms in R using PISA test score data. The model is based on the brms guide for IRT models, found here. I have placed hierarchical priors on the person parameters (latent ability estimates per student) and have not enforced a hierarchy on the item parameters (item easiness).
When I run the model, I encounter an initialization issue that never resolves when running on the full dataset:
Chain 2: Rejecting initial value:
Chain 2: Log probability evaluates to log(0), i.e. negative infinity.
Initially, I believed this came from a bad choice of priors. Fiddling with the priors in admittedly arbitrary ways reduced the issue on subsets of the data, but it has not resolved it on the full dataset. The code for the formula, priors, and model is as follows:
# Formula for the model
pisa_formula_finlit <- bf(
  scored_response ~ lang_spoken + escs + sex + native_status + country_by_language +
    (1 | cntstuid) + (0 + country_by_language |i| item),
  disc ~ country_by_language + (0 + country_by_language |i| item)
)
# Develop a set of priors for the model.
my_priors <- set_prior('normal(0, 1)', class = 'Intercept') +
  set_prior('normal(1.5, 3)', class = 'b') +
  set_prior('normal(0, 5)', class = 'sd', group = 'cntstuid') +
  set_prior('normal(0, 5)', class = 'sd', group = 'item') +
  set_prior('normal(0, 5)', class = 'sd', group = 'item', dpar = 'disc')
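One thing I am not sure about is whether these priors cover every parameter class the formula implies; in particular, the population-level coefficients of disc (which need dpar = 'disc' in set_prior and sit on the log scale) may still be on flat defaults. get_prior() lists all the slots, so any row with an empty prior column is a default worth replacing. A sketch, assuming the data are loaded:

```r
# List every prior slot the model defines, alongside the brms defaults.
# Rows with an empty 'prior' column are flat defaults; for log-scale
# parameters like the 'disc' coefficients, flat or very wide priors can
# produce enormous discriminations after exponentiation.
all_priors <- get_prior(
  pisa_formula_finlit,
  data = finlit_data_clean,
  family = brmsfamily("cumulative", "logit")
)
all_priors
```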
# Code to run the model
pisa_model_finlit <- brm(
  formula = pisa_formula_finlit,
  data = finlit_data_clean,
  family = brmsfamily("cumulative", "logit"),
  prior = my_priors,
  cores = parallel::detectCores()
)
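To iterate on the problem more cheaply, one option I am considering (a sketch, not a definitive fix) is fitting on a small random subset with init = 0 (called inits in older brms versions), which starts every chain at zero on the unconstrained scale instead of the default uniform(-2, 2) draws. If the subset samples cleanly with init = 0 but fails with the default random inits, that would point toward overflow in exp()-transformed parameters such as disc rather than a mis-specified likelihood:

```r
set.seed(123)  # hypothetical seed, just for a reproducible subset

# Draw a small random subset of the full data for fast debugging runs.
debug_subset <- finlit_data_clean[sample(nrow(finlit_data_clean), 10000), ]

# 'init = 0' starts all chains at zero on the unconstrained scale,
# which often avoids the log(0) rejection at initialization.
pisa_model_debug <- brm(
  formula = pisa_formula_finlit,
  data = debug_subset,
  family = brmsfamily("cumulative", "logit"),
  prior = my_priors,
  init = 0,
  chains = 2,
  iter = 500,
  cores = 2
)
```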
The full dataset, which I intend to run the model on using AWS, contains roughly 1.1 million observations.
Any advice for how I might debug the model or respecify priors to reduce the issue would be greatly appreciated.