Hi all, I’m trying to model the effects of an intervention on an accuracy score which was part of a cognitive test. Participants had to perform a series of arithmetic calculations and were given a score between 0% and 100% for the number of responses which were correct.
I am trying to model this data in brms but have not found a satisfying method. The first issue is that many of the participants did very well and get 100% (or close to) on the test. So there is clustering and a ceiling effect at high scores. The second issue is that some participants did quite poorly and got low scores (0-10%), which are far away from the sample mean and may end up being highly influential.
I have clumsily simulated some data:
x <- rnorm(200)
acc <- rnorm(200, mean = 1, sd = 0.2)
acc <- ifelse(acc > 1, rnorm(200, mean = 0.95, sd = 0.05), acc)
acc <- ifelse(acc > 1, 1, acc)
acc[c(1:5)] <- c(0, 0.05, 0.07, 0.2, 0.3)
d <- tibble(x = x, acc = acc)
This is what the distribution of simulated acc variable looks like:
My first thought was to fit the model using a truncated normal:
m1_trunc <- brm(acc | trunc(ub = 1) ~ x,
data = d,
family = gaussian(),
prior = prior(normal(0,5), class = "b"),
cores = 4, chains = 4,
control = list(adapt_delta = 0.99),
backend = "cmdstanr")
But there seemed to be issues with exploring the posterior with this model, with Rhat values >= 1.01, low ESS, and parameter estimates off target.
Skew_normal() and student() distributions on the other hand have no convergence issues and return the correct parameter value. But both do a poor job of reproducing the data:
m2_skew <- brm(acc ~ x,
family = skew_normal(),
prior= prior(normal(0, 100), class = "alpha"),
control = list(adapt_delta = 0.99),
cores = 4, chains = 4,
data = d,
backend = "cmdstanr")
pp_check(m2_skew, ndraws = 30)
m3_student <- brm(acc ~ x,
family = student(),
data = d,
chains = 4, cores = 4,
control = list(adapt_delta = 0.99),
backend = "cmdstanr")
pp_check(m3_student, ndraws = 30)
In any case, neither of these models really match the process that produced the data. I am sure that there is a more appropriate approach. In the case of reaction time data, specific models have been developed which match that data well (e.g., shifted log-normal). But I’m not aware of corresponding models in the case of accuracy scores.
How best might we model data of this kind? Would some kind of mixture be appropriate which models 1) the probability of having a perfect (or very high) score and 2) the mean accuracy score?
Thanks!