- Operating System: MacOS
- brms Version: 2.16.2
I’m fitting a multilevel IRT model to experimental data. In the design, all 223 participants responded to the same 5 items under each of 3 experimental conditions (control, healthy, distressed). All five items were Likert-type questions with 6 response options. I’ve been modeling the Likert-type ratings (`rating`) as a function of `condition`, with the values nested in both `item` and `id` (the participant identifier). My preferred likelihood setting is `family = cumulative(probit)`.
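For orientation, `d_avoidance_items` is in the long format the `formula` requires, with one row per participant-item-condition rating (223 × 5 × 3 = 3,345 rows, if complete). Here’s a quick sketch of how I’d glance at that structure; nothing below is required for the models:

```r
library(tidyverse)

# one row per participant-item-condition rating
d_avoidance_items %>% 
  glimpse()

# should show every item-by-condition cell with (up to) 223 ratings each
d_avoidance_items %>% 
  count(item, condition)
```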
Thus, my initial `formula` syntax was:
rating | thres(gr = factor(item)) ~ 1 + condition + (1 + condition | item) + (1 + condition | id)
However, I wanted to make the level-2 correlation structure more meaningful, so I also reparameterized the model `formula` like so:
rating | thres(gr = factor(item)) ~ 1 + condition + (0 + condition | item) + (0 + condition | id)
My initial assumption was that this would change the level-2 sd/correlation parameters a bit and maybe help or hurt the effective sample size estimates, but I presumed it would not change the level-1 thresholds or $\beta$ parameters. I was wrong.
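To spell out the reasoning behind that presumption: assuming treatment coding for `condition` (control as the reference level), the varying effects in the two formulas should just be linear transforms of one another. Writing $u_j$ for participant $j$’s (intercept, slope, slope) vector from the `(1 + condition | id)` term and $v_j$ for the per-condition vector from the `(0 + condition | id)` term,

$$
v_j = T u_j, \qquad
T = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}, \qquad
\Sigma_v = T \Sigma_u T^\top,
$$

so I expected both versions to imply the same distribution of group-level deviations, just summarized by different sd/correlation parameters. The same logic applies to the `item` terms.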
Leaving out technical settings like `cores` and so on, here’s the code I used to fit the models:
library(brms)

# 26.01536 mins
avoid_3_all <-
  brm(data = d_avoidance_items,
      family = cumulative(probit),
      rating | thres(gr = factor(item)) ~ 1 + condition + (1 + condition | item) + (1 + condition | id),
      prior = c(prior(normal(0, 2), class = Intercept),
                prior(normal(0, 1.5), class = b),
                prior(exponential(1.0 / 1.5), class = sd),
                prior(lkj(1), class = cor)))

# 15.6703 mins
avoid_3b_all <-
  brm(data = d_avoidance_items,
      family = cumulative(probit),
      rating | thres(gr = factor(item)) ~ 1 + condition + (0 + condition | item) + (0 + condition | id),
      prior = c(prior(normal(0, 2), class = Intercept),
                prior(normal(0, 1.5), class = b),
                prior(exponential(1.0 / 1.5), class = sd),
                prior(lkj(1), class = cor)))
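As a sanity check that the two formulas really do imply the same number and classes of parameters (three group-level sds plus their correlations per grouping factor, just with different coefficient names), something like `get_prior()` should show it. A sketch, assuming the same data object as above:

```r
# inspect the parameter structure each formula implies
get_prior(rating | thres(gr = factor(item)) ~ 1 + condition + (1 + condition | item) + (1 + condition | id),
          data = d_avoidance_items,
          family = cumulative(probit))

get_prior(rating | thres(gr = factor(item)) ~ 1 + condition + (0 + condition | item) + (0 + condition | id),
          data = d_avoidance_items,
          family = cumulative(probit))
```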
The thing I don’t understand is that the threshold posteriors for the first model are about twice as wide as those for the second model. Here’s what that looks like in a coefficient plot:
library(tidyverse)

# rows 1:39 of each summary cover the thresholds, the condition effects,
# and the level-2 sds/correlations
bind_rows(posterior_summary(avoid_3_all)[1:39, ] %>% data.frame() %>% rownames_to_column("parameter"),
          posterior_summary(avoid_3b_all)[1:39, ] %>% data.frame() %>% rownames_to_column("parameter")) %>% 
  mutate(fit = rep(c("avoid_3_all", "avoid_3b_all"), each = n() / 2)) %>% 
  ggplot(aes(x = parameter, y = Estimate, ymin = Q2.5, ymax = Q97.5, color = fit)) +
  geom_rect(xmin = 2.5, xmax = 27.5, ymin = -Inf, ymax = Inf,
            fill = "grey95", size = 0) +
  geom_hline(yintercept = 0, size = 1/4, color = "grey67") +
  geom_pointrange(size = 1/3, fatten = 3/4, position = position_dodge(width = -0.6)) +
  scale_color_manual(values = c("black", "red3")) +
  xlab(NULL) +
  coord_flip() +
  theme_bw() +
  theme(axis.text.y = element_text(hjust = 0),
        panel.grid = element_blank())
Though it’s no surprise the level-2 sd/correlations have changed (upper white section), I’m shocked by how much more precise the thresholds are for the second model (in red in the middle gray section).
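In case numbers are easier to compare than the plot, here’s a quick sketch of the same comparison via `fixef()`, whose rows include the thresholds (their names contain `"Intercept"`); `width` is just the 95% interval width:

```r
# compare 95% interval widths for the thresholds across the two fits
bind_rows(
  fixef(avoid_3_all)  %>% data.frame() %>% rownames_to_column("parameter") %>% mutate(fit = "avoid_3_all"),
  fixef(avoid_3b_all) %>% data.frame() %>% rownames_to_column("parameter") %>% mutate(fit = "avoid_3b_all")
) %>% 
  filter(str_detect(parameter, "Intercept")) %>% 
  mutate(width = Q97.5 - Q2.5) %>% 
  select(fit, parameter, Estimate, width)
```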
Also, even though the two models are fit to the same data, with the same predictors and the same number of parameters, their LOO/WAIC values are a little different.
avoid_3_all <- add_criterion(avoid_3_all, criterion = c("loo", "waic"))
avoid_3b_all <- add_criterion(avoid_3b_all, criterion = c("loo", "waic"))
loo_compare(avoid_3_all, avoid_3b_all) %>% print(simplify = F)
             elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic  se_looic
avoid_3_all        0.0     0.0  -2663.0        58.6 473.9     18.9 5325.9    117.3
avoid_3b_all      -0.6     1.6  -2663.6        58.7 475.6     19.1 5327.2    117.4
Can anyone help me understand why this simple reparameterization of the level-2 sd/correlation structure has such a profound effect on the other parts of the model?
If it’s of any help, you can download the two fit objects from GitHub: ASKurz/Conditional-multilevel-IRT-and-the-level-2-correlation-structure.
Tagging @matti for good measure.