# Brms: 2PL IRT for unbalanced panel data

Hello all,

I try to build a 2PL IRT binary model for the unbalanced panel data using brms. I try to estimate \theta_{it}, where i is country and t is year.

My 2PL model may take a mathematical form as follows.

Pr(y_{itj}=1) = \frac{e^{\alpha_{j}(\Theta_{it} +\beta_{j} ) }}{1+e^{\alpha_{j}(\Theta_{it} +\beta_{j} )}}

, where Pr(y_{itj} = 1) is the probability that takes the value 1 if the item j is observed from the country i in year t and 0 otherwise. And, the item easiness parameter \beta_{j}, and the item discrimination parameter \alpha_{j}. My data covers 77 countries between 1965 and 2005 (unbalanced panel) and 19 items are administered. The data looks like,

A tibble: 56,772 x 5
Country_Year Country_CCODE  Year variable         value
<int>         <int> <int> <fct>            <int>
1       401965            40  1965 Military_Defense     0
2       401966            40  1966 Military_Defense     0
3       401967            40  1967 Military_Defense     0
4       401968            40  1968 Military_Defense     0
5       401969            40  1969 Military_Defense     0
# ... with 56,762 more rows


, where “Country_Year” is country-year (my main interest), “Country-CCODE” is country code, and “variable” is item, and “value” is binary response, 0 or 1.

Building on @paul.buerkner’s materials of SPM-IRT-models/SPM_IRT_analysis.Rmd at master · paul-buerkner/SPM-IRT-models · GitHub and https://arxiv.org/pdf/1905.09501.pdf, I tried to build a model like this,

  formula_civil_2pl_16 <- bf(
value ~ beta + alpha * theta,
theta ~ 0 + Country_Year,
beta ~ 0 + variable,
alpha ~ 0 + variable,
nl = TRUE
)

prior_civil_2pl_16 <-
prior("normal(0, 10)", class = "b", nlpar = "beta", lb=-6, ub=6) +
prior("gamma(4, 3)", class = "b", nlpar = "alpha", lb=0) +
prior("normal(0, 1)", class = "b", nlpar = "theta")

fit_civil_2pl_16 <- brm(
formula = formula_civil_2pl_16,
data = civil_2pl_1_1_long,
family = brmsfamily("bernoulli", "logit"),
prior = prior_civil_2pl_16,
cores =4,
)


But, I could not get \theta for each year and country. Instead, I could only get a population-level effect of \theta like,

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
theta_Country_Year                       -0.00      0.00    -0.00     0.00 2.84        5       29


Of course, I can try to build a multi-level model as the papers of brms IRT suggests like this,

formula_civil_2pl_17 <- bf(
value ~ beta + alpha * theta,
theta ~ 0 + (1 | Country_Year),
beta ~ 0 + variable,
alpha ~ 0 + variable,
nl = TRUE
)

prior_civil_2pl_17 <-
prior("normal(0, 10)", class = "b", nlpar = "beta", lb=-6, ub=6) +
prior("gamma(4, 3)", class = "b", nlpar = "alpha", lb=0) +
prior("normal(0, 1)", class = "sd", group = "Country_Year", nlpar = "theta")


,which allows me to get a group-level effect and \theta for each country-year like this,

Group-Level Effects:
~Country_Year (Number of levels: 2988)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(theta_Intercept)     0.71      0.08     0.57     0.89 1.00      544     1162

get_variables(fit_civil_2pl_17)
[39] "sd_Country_Year__theta_Intercept"         "r_Country_Year__theta[401965,Intercept]"
[41] "r_Country_Year__theta[401966,Intercept]"  "r_Country_Year__theta[401967,Intercept]"
[43] "r_Country_Year__theta[401968,Intercept]"  "r_Country_Year__theta[401969,Intercept]"
[45] "r_Country_Year__theta[401970,Intercept]"  "r_Country_Year__theta[401971,Intercept]"


The problem is, however, I can’t think of hierarchical (or mixed-effect) modeling for the “country-year” level, although the hierarchical structure is possible for “country” level since country-year observations within a country might be correlated.

So, my first question is: how I can construct a non-hierarchical model to estimate \theta for every country-year by setting a prior (0,1) on \theta_{it} in brms irt?

The second question is a little more complicated: how I can estimate \theta for every country-year, and, simultaneously, reflect the hierarchical structure at “country” level?

Thank you for reading a long question. Any tips will be helpful.

I’ve attached two recent papers that use multilevel item-response models to generate smoothed country-level cross-sectional time series panels using Stan. While their application is to public opinion, they’re likely directly relevant to your issue and both have detailed replication materials available online.

1 Like

Thank you! Although I have to figure out how I can customize Stan code for my purpose (it seems harder for the beginners like me), I can start from your materials.

1 Like