Brms: 2PL IRT for unbalanced panel data

Hello all,

I try to build a 2PL IRT binary model for the unbalanced panel data using brms. I try to estimate \theta_{it}, where i is country and t is year.

My 2PL model may take a mathematical form as follows.

Pr(y_{itj}=1) = \frac{e^{\alpha_{j}(\Theta_{it} +\beta_{j} ) }}{1+e^{\alpha_{j}(\Theta_{it} +\beta_{j} )}}

, where Pr(y_{itj} = 1) is the probability that takes the value 1 if the item j is observed from the country i in year t and 0 otherwise. And, the item easiness parameter \beta_{j}, and the item discrimination parameter \alpha_{j}. My data covers 77 countries between 1965 and 2005 (unbalanced panel) and 19 items are administered. The data looks like,

A tibble: 56,772 x 5
   Country_Year Country_CCODE  Year variable         value
          <int>         <int> <int> <fct>            <int>
 1       401965            40  1965 Military_Defense     0
 2       401966            40  1966 Military_Defense     0
 3       401967            40  1967 Military_Defense     0
 4       401968            40  1968 Military_Defense     0
 5       401969            40  1969 Military_Defense     0
# ... with 56,762 more rows

, where “Country_Year” is country-year (my main interest), “Country-CCODE” is country code, and “variable” is item, and “value” is binary response, 0 or 1.

Building on @paul.buerkner’s materials of SPM-IRT-models/SPM_IRT_analysis.Rmd at master · paul-buerkner/SPM-IRT-models · GitHub and https://arxiv.org/pdf/1905.09501.pdf, I tried to build a model like this,

  formula_civil_2pl_16 <- bf(
     value ~ beta + alpha * theta,
     theta ~ 0 + Country_Year,
     beta ~ 0 + variable,
     alpha ~ 0 + variable,
     nl = TRUE
  )
  
  prior_civil_2pl_16 <-
     prior("normal(0, 10)", class = "b", nlpar = "beta", lb=-6, ub=6) +
     prior("gamma(4, 3)", class = "b", nlpar = "alpha", lb=0) +
     prior("normal(0, 1)", class = "b", nlpar = "theta")
  
  fit_civil_2pl_16 <- brm(
     formula = formula_civil_2pl_16,
     data = civil_2pl_1_1_long,
     family = brmsfamily("bernoulli", "logit"),
     prior = prior_civil_2pl_16,
     cores =4,
  )

But, I could not get \theta for each year and country. Instead, I could only get a population-level effect of \theta like,

Population-Level Effects: 
                                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
theta_Country_Year                       -0.00      0.00    -0.00     0.00 2.84        5       29

Of course, I can try to build a multi-level model as the papers of brms IRT suggests like this,

formula_civil_2pl_17 <- bf(
     value ~ beta + alpha * theta,
     theta ~ 0 + (1 | Country_Year),
     beta ~ 0 + variable,
     alpha ~ 0 + variable,
     nl = TRUE
  )
  
  prior_civil_2pl_17 <-
     prior("normal(0, 10)", class = "b", nlpar = "beta", lb=-6, ub=6) +
     prior("gamma(4, 3)", class = "b", nlpar = "alpha", lb=0) +
     prior("normal(0, 1)", class = "sd", group = "Country_Year", nlpar = "theta")

,which allows me to get a group-level effect and \theta for each country-year like this,

Group-Level Effects: 
~Country_Year (Number of levels: 2988) 
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(theta_Intercept)     0.71      0.08     0.57     0.89 1.00      544     1162
get_variables(fit_civil_2pl_17)
 [39] "sd_Country_Year__theta_Intercept"         "r_Country_Year__theta[401965,Intercept]" 
  [41] "r_Country_Year__theta[401966,Intercept]"  "r_Country_Year__theta[401967,Intercept]" 
  [43] "r_Country_Year__theta[401968,Intercept]"  "r_Country_Year__theta[401969,Intercept]" 
  [45] "r_Country_Year__theta[401970,Intercept]"  "r_Country_Year__theta[401971,Intercept]" 

The problem is, however, I can’t think of hierarchical (or mixed-effect) modeling for the “country-year” level, although the hierarchical structure is possible for “country” level since country-year observations within a country might be correlated.

So, my first question is: how I can construct a non-hierarchical model to estimate \theta for every country-year by setting a prior (0,1) on \theta_{it} in brms irt?

The second question is a little more complicated: how I can estimate \theta for every country-year, and, simultaneously, reflect the hierarchical structure at “country” level?

Thank you for reading a long question. Any tips will be helpful.

I’ve attached two recent papers that use multilevel item-response models to generate smoothed country-level cross-sectional time series panels using Stan. While their application is to public opinion, they’re likely directly relevant to your issue and both have detailed replication materials available online.

Kolczynska et al (2020)_Modeling public opinion over time and space_Trust in state institutions in Europe, 1989-2019.pdf (891.7 KB)

Classen_2019_Estimating Smooth Country–Year Panels of Public Opinion.pdf (2.3 MB)

1 Like

Thank you! Although I have to figure out how I can customize Stan code for my purpose (it seems harder for the beginners like me), I can start from your materials.

1 Like