Background
A simple Rasch model in Item Response Theory is as follows:
\Pr(y_{jk} = 1) = {\rm logit}^{-1}(\alpha_j - \beta_k),
where \alpha_j represents the latent ability of person j and \beta_k represents the difficulty of item k. However, this model is not identified because a constant can be added to all the abilities \alpha_j and all the difficulties \beta_k and the model remains the same (e.g., ((\alpha_j + c) - (\beta_k + c))). They propose to address this by shifting the ability and difficulty parameters using the mean of the \alpha_j's and scaling the ability and difficulty parameters using the standard deviation of the \alpha_j's, so that \mu_a is set to 0 to identify the model.
Their BUGS code is available from
http://www.stat.columbia.edu/~gelman/arm/examples/Book_Codes.zip, in ./Book_Codes/Ch.19/19.5_Bugs_codes.bug.
What I am trying to do
Below is my attempt to implement it in Stan.
rasch_code <- "data {
int <lower=1> J; // number of students
int <lower=1> K; // number of questions
int <lower=1> N; // number of observations
// [9/15/23]: installed rstan 2.26.24 from source; now array[N] y works
array[N] int<lower=1, upper=J> jj; // student for observation n
array[N] int<lower=1, upper=K> kk; // question for observation n
array[N] int<lower=0, upper=1> y; // correctness for observation n
}
parameters {
vector[J] a_raw;
vector[K] b_raw;
real mu_a_raw;
real<lower=0> sigma_a_raw;
real mu_b_raw;
real<lower=0> sigma_b_raw;
}
transformed parameters {
vector[J] a;
vector[K] b;
real shift;
real scale;
shift = mean(a);
scale = sd(a);
a = (a_raw - shift)/scale;
b = (b_raw - shift)/scale;
}
model {
a_raw ~ normal(mu_a_raw, sigma_a_raw);
b_raw ~ normal(mu_b_raw, sigma_b_raw);
mu_a_raw ~ normal(0, 2.5);
sigma_a_raw ~ exponential(1);
mu_b_raw ~ normal(0, 2.5);
sigma_b_raw ~ exponential(1);
y ~ bernoulli_logit(a[jj] - b[kk]);
}
"
Note that I use the transformed parameters block to do the shifting and scaling.
a = (a_raw - shift)/scale;
b = (b_raw - shift)/scale;
The transformed parameters
block makes sense to me. But the error message below suggests otherwise.
Here is the R code:
data('VerbAgg', package = "lme4") # 316 people by 24 items each
J <- length(unique(VerbAgg$id)) # n of students/respondents
K <- length(unique(VerbAgg$item)) # n of questions
N <- nrow(VerbAgg) # n of obs in long data format
jj <- match(VerbAgg$id, unique(VerbAgg$id)) # index for people
kk <- match(VerbAgg$item, unique(VerbAgg$item)) # index for item
standat <- list(
J = J,
K = K,
N = N,
jj = jj,
kk = kk,
y = as.numeric(VerbAgg$r2) - 1 # 1="N", 2="Y"
)
library('rstan')
options(mc.cores = parallel::detectCores())
rasch_fit <- stan(model_code = rasch_code, data = standat,
model_name = "Rasch_shift")
Error message
I get the rejecting initial value error. I understand the error. The transformed parameters are not initialized. Their values are all nan
so that the bernoulli_logit(a[jj] - b[kk])
fails.
SAMPLING FOR MODEL 'Rasch_shift' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Random variable[1] is nan, but must be not nan! (in 'Rasch_shift', line 31, column 2 to column 36)
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 1 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in eval(ei, envir) : Initialization failed."
error occurred during calling the sampler; sampling not done
How do I fixed this?
I know I don’t have to use shift and scale. I can get identifiability by simply setting \mu_a to 0, or by setting \mu_b to 0 (but not both). But the shift and scale approach looks more elegant. And there should be a way to get it to work in Stan. Thoughts? Many thanks in advance.
BTW, the same error if I set initial values manually:
initfun <- function(...) {
list( a_raw = rnorm(n=J, mean=0, sd=1),
b_raw = rnorm(n=K, mean=0, sd=1),
mu_a_raw = runif(1, min=0, max=5),
mu_b_raw = runif(1, min=0, max=5),
sigma_a_raw = rexp(n=1, rate=1),
sigma_b_raw = rexp(n=1, rate=1),
shift = runif(1, min=1, max=5),
scale = runif(1, min=1, max=5),
a = rnorm(n=J, mean=0, sd=1),
b = rnorm(n=K, mean=0, sd=1)) }
rasch_fit <- stan(model_code = rasch_code, data = standat,
init = initfun, model_name = "Rasch_shift")