Writing out the linear model equations for multilevel IRT in BRMS/Stan

Hi all,
I’m truly stuck. I’m trying to specify a relatively complicated IRT model in BRMS. I’ve been following Paul Burkner’s Bayesian IRT paper, which is very helpful from the coding perspective. However, I’m trying to write out the model to include in my research paper, and that’s proving difficult. Given the formula below, is anyone able to provide some assistance with writing out the various levels of the model equations? I’d be very grateful!

mod1 ← bf(
response ~ exp(logalpha) * eta,
eta ~ 1 + cov1 + cov2 + cov3 + cov4+ time + (1 | i | item) + (1 + year | id),
logalpha ~ 1 + (1 | i | item),
nl = TRUE
)

I will try to take a shot at it. But first, it would be helpful to know whether your cov, time, and year variables are unique to each person, unique to each item, or unique to each person-by-item combination. I would guess that they are mostly unique to each person, but year is possibly unique to each item. Also, I assume that id is a person indicator, but that would be helpful to clarify.

Ed, thanks so much for being willing to help! Covariates (covs) would be specific to each person. Apologies for the confusion, but time and year are intended to be the same, which is “year” and indeed I’m hoping for a measurement for each person per year. I would assume that item parameters are constant over time. And, yes, id is the person indicator. This is hugely generous of you, thank you again.

Those clarifications are helpful. I guess that your response variable might be binary or ordinal, in which case this would be embedded in a binomial logit or probit likelihood. But I am just focusing on what is inside the bf() command that you provided. So y_{ijk} is the response of person i on item j at time k, which might be something like the log odds of P(y_{ijk} = 1) depending on the model.

y_{ijk} = \alpha_j \times (\delta_j + \theta_{ik}) \\ \theta_{ik} = \beta_{0i} + \beta_1(cov_{i1}) + \beta_2(cov_{i2}) + \beta_3(cov_{i3}) + \beta_4(cov_{i4}) + \beta_{5i} year_{ik} \\ (\beta_{0i}\ \beta_{5i})' \sim N_2((\mu_{\beta0}\ \mu_{\beta5})', \Sigma_b) \\ (\delta_j\ \log(\alpha_j))' \sim N_2((\mu_\delta\ \mu_{\log(\alpha)})', \Sigma_{item}) \\

I have mixed feelings about writing these equations in applied papers. On the one hand, it is always good to know exactly what model is being estimated. On the other hand, in applied papers, I find that the equations are incorrect more often than not (and there is a chance that I made a mistake above!). So I think that sharing the code and data is just as important as including the equations.

Hello! Thank you so much for having a look - this is extremely helpful and I get your point about the risk of including equations. You are right to assume response variable is binary - I’m using the logit link. I’ve never seen ~ N_2( , ) - my best guess is this indicates a characterization of a joint distribution of the variables on the left - is that right? thank you again!

Yes, N_2 is the bivariate normal distribution, which allows for a correlation between the parameters on the left of the tilde.