How to implement Gelman and Hill's shift and scale approach to nonidentifiabiltiy in Item Response model


A simple Rasch model in Item Response Theory is as follows:

\Pr(y_{jk} = 1) = {\rm logit}^{-1}(\alpha_j - \beta_k),

where \alpha_j represents the latent ability of person j and \beta_k represents the difficulty of item k. However, this model is not identified because a constant can be added to all the abilities \alpha_j and all the difficulties \beta_k and the model remains the same (e.g., ((\alpha_j + c) - (\beta_k + c))). They propose to address this by shifting the ability and difficulty parameters using the mean of the \alpha_j's and scaling the ability and difficulty parameters using the standard deviation of the \alpha_j's, so that \mu_a is set to 0 to identify the model.

Their BUGS code is available from, in ./Book_Codes/Ch.19/19.5_Bugs_codes.bug.

What I am trying to do
Below is my attempt to implement it in Stan.

rasch_code <- "data {
  int <lower=1> J;             // number of students
  int <lower=1> K;             // number of questions
  int <lower=1> N;             // number of observations
  // [9/15/23]: installed rstan 2.26.24 from source; now array[N] y works
  array[N] int<lower=1, upper=J> jj;  // student for observation n
  array[N] int<lower=1, upper=K> kk;  // question for observation n
  array[N] int<lower=0, upper=1> y;   // correctness for observation n
parameters {
  vector[J] a_raw;
  vector[K] b_raw;

  real mu_a_raw;
  real<lower=0> sigma_a_raw;

  real mu_b_raw;
  real<lower=0> sigma_b_raw;
transformed parameters {
  vector[J] a;
  vector[K] b;

  real shift;
  real scale;
  shift = mean(a);
  scale = sd(a);

  a = (a_raw - shift)/scale;
  b = (b_raw - shift)/scale;
model {

  a_raw ~ normal(mu_a_raw, sigma_a_raw);
  b_raw ~ normal(mu_b_raw, sigma_b_raw);

  mu_a_raw ~ normal(0, 2.5);
  sigma_a_raw ~ exponential(1);

  mu_b_raw ~ normal(0, 2.5);
  sigma_b_raw ~ exponential(1);

  y ~ bernoulli_logit(a[jj] - b[kk]);


Note that I use the transformed parameters block to do the shifting and scaling.

  a = (a_raw - shift)/scale;
  b = (b_raw - shift)/scale;

The transformed parameters block makes sense to me. But the error message below suggests otherwise.

Here is the R code:

data('VerbAgg', package = "lme4")   # 316 people by 24 items each

J <- length(unique(VerbAgg$id))      # n of students/respondents
K <- length(unique(VerbAgg$item)) # n of questions
N <- nrow(VerbAgg)                         # n of obs in long data format
jj <- match(VerbAgg$id, unique(VerbAgg$id))     # index for people
kk <- match(VerbAgg$item, unique(VerbAgg$item)) # index for item

standat <- list(
  J = J,
  K = K,
  N = N,
  jj = jj,
  kk = kk,
  y = as.numeric(VerbAgg$r2) - 1  # 1="N", 2="Y"

options(mc.cores = parallel::detectCores())

rasch_fit <- stan(model_code = rasch_code, data = standat,
             model_name = "Rasch_shift")

Error message

I get the rejecting initial value error. I understand the error. The transformed parameters are not initialized. Their values are all nan so that the bernoulli_logit(a[jj] - b[kk]) fails.

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Random variable[1] is nan, but must be not nan! (in 'Rasch_shift', line 31, column 2 to column 36)
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 1 attempts.
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in eval(ei, envir) : Initialization failed."
error occurred during calling the sampler; sampling not done

How do I fixed this?

I know I don’t have to use shift and scale. I can get identifiability by simply setting \mu_a to 0, or by setting \mu_b to 0 (but not both). But the shift and scale approach looks more elegant. And there should be a way to get it to work in Stan. Thoughts? Many thanks in advance.

BTW, the same error if I set initial values manually:

initfun <- function(...) {
  list( a_raw       = rnorm(n=J, mean=0, sd=1),
        b_raw       = rnorm(n=K, mean=0, sd=1),
        mu_a_raw    = runif(1, min=0, max=5),
        mu_b_raw    = runif(1, min=0, max=5),
        sigma_a_raw = rexp(n=1, rate=1),
        sigma_b_raw = rexp(n=1, rate=1),
        shift       = runif(1, min=1, max=5),
        scale       = runif(1, min=1, max=5),
        a = rnorm(n=J, mean=0, sd=1),
        b = rnorm(n=K, mean=0, sd=1))    }

rasch_fit <- stan(model_code = rasch_code, data = standat,
             init = initfun, model_name = "Rasch_shift")
1 Like

I don’t really know how the model works, but maybe defining shift and scale in terms of a_raw instead of a would help?

This does work but they don’t recommend it, especially with Stan. Just set an informative N(0,1) or equivalent prior on the ability scores and it’ll be fine.

Thanks for the suggestions.

I am afraid, though, defining shift and scale by a_raw seems to destabilize the model with many warnings. Informative prior N(0, 1) works, or simply setting \mu_a = 0, I have tried both. But I hope to get shift and scale to work using Stan, especially if I want to add an item slope \gamma_k.

You just constrain the gamma_k to be positive, and it’s fine.

True. And that is how I did it before. I am hoping to move to a more elegant solution, like the shift and scale in Gelman and Hill. Thanks for your continued input.