Hi everyone,
I am trying to model count data, but judging by the posterior predictive checks (pp_check), I don't think I am specifying the model correctly.
About the data:
- Participants perform a task (twice) that requires them to change between locations. We want to measure and model the number of changes as a function of an earlier priming factor with 2 levels.
- Note that the minimum possible count is 4, which suggests a truncated model. Alternatively, I could subtract 4 from the counts, but I am first trying to model the distribution as is.
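To make the two options concrete, here is a minimal base R sketch of what each one does. The values of `mu` and `shape` and the short `counts_example` vector are made up for illustration; they are not estimates from my models. Truncation renormalises the negative binomial mass on values of 4 and above, whereas subtracting 4 simply moves the support so that it starts at 0, so the two are related but not the same model.

```r
# Hypothetical parameter values, for illustration only.
mu    <- 5    # mean of the untruncated negative binomial
shape <- 10   # dispersion (called "size" in base R)
lb    <- 4    # minimum possible count

# Option 1: lower-truncated pmf, P(Y = y | Y >= lb).
# Divide the untruncated mass by P(Y >= lb) = P(Y > lb - 1).
y <- lb:200
p_trunc <- dnbinom(y, size = shape, mu = mu) /
  pnbinom(lb - 1, size = shape, mu = mu, lower.tail = FALSE)

# Option 2: shift the outcome so that the minimum becomes 0.
counts_example <- c(5, 4, 4, 5, 6, 6, 5, 5)  # illustrative values
shifted <- counts_example - lb

print(round(head(p_trunc), 3))
print(shifted)
```

The truncated probabilities sum to 1 over the values from 4 upwards, and the shifted counts could be passed to an ordinary (non-truncated) count family.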
I have fitted the following models:
library(brms)
library(tibble)
# the data look like this (40% excerpt):
counts <- c(5, 4, 4, 5, 6, 6, 5, 5, 5, 6, 6, 4, 4, 4, 4, 6, 7, 5, 4, 5, 5, 5, 4, 6, 10, 5, 5, 4, 5, 4, 6, 5)
prime <- c(rep('A', 16), rep('B', 16))
task <- c(rep('1', 8), rep('2', 8), rep('1', 8), rep('2', 8))
subject <- paste0("S", 1:32)
data <- tibble(counts, prime, task, subject)
# models
## standard (no truncation)
m1 <- brm(counts ~ 1 + prime + task + (1 | subject),
          family = negbinomial(),
          prior = set_prior("normal(0, 5)", class = "b"),
          data = data,
          save_all_pars = TRUE)
## truncated at the minimum possible count (lb = 4)
m2 <- brm(counts | trunc(lb = 4) ~ 1 + prime + task + (1 | subject),
          family = negbinomial(),
          prior = set_prior("normal(0, 5)", class = "b"),
          data = data,
          save_all_pars = TRUE)
I have tried this with Poisson and negative binomial distributions (as well as other options, such as gamma). I have also tried adding a couple of other predictors (e.g. gender), but they don't make a difference.
From the posterior predictive checks, lognormal provides the best fit but is inappropriate (it assumes a continuous outcome), followed by Poisson and negative binomial, which are close.
[Figure: pp_check, non-truncated negative binomial]
[Figure: pp_check, truncated negative binomial (results similar with Poisson)]
However, as you can see in the figures, the pp_checks show that the models do not capture the 4s (overpredicted) and the 5s (underpredicted) well.
I am not sure how to improve the fit before deciding on a model with LOO-CV.
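To put numbers on the 4s and 5s, here is a quick base R tabulation of the excerpt. It only describes the observed data, as a sketch; it ignores the predictors and the subject-level intercepts, and since this is just the 40% excerpt the full-data proportions may differ.

```r
# Counts from the 40% excerpt above.
counts <- c(5, 4, 4, 5, 6, 6, 5, 5, 5, 6, 6, 4, 4, 4, 4, 6,
            7, 5, 4, 5, 5, 5, 4, 6, 10, 5, 5, 4, 5, 4, 6, 5)

# Observed share of each distinct count value.
obs <- prop.table(table(counts))
print(round(obs, 2))
```

These observed shares are what the model-predicted shares should roughly match. As far as I can tell, a bar-type check such as `pp_check(m2, type = "bars")` shows the same comparison per count value and is better suited to discrete outcomes than the default density overlay.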
Any advice most welcome!
- brms beginner ;-)
- Operating System: Mac OSX
- brms Version: brms_2.14.4