I am trying to model count data, but judging from the posterior predictive checks (pp_check), I don't think I have specified the model correctly.
About the data:
Participants perform a task (twice) that requires them to change between locations, and we want to measure and model the number of changes as a function of an earlier priming factor with 2 levels.
Note that the minimum possible count is 4, which suggests a truncated model. Alternatively, I could subtract 4 from the counts, but I am first trying to model the distribution as is.
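A minimal sketch of the second option (subtracting 4), using only the excerpt of counts shown in this post. The names `min_count`, `shifted` and `fit_shifted` are made up for illustration, and the helper is only defined, not run, so treat it as an assumption about how such a model could be set up rather than a tested fit.

```r
# Excerpt of the counts from this post (not the full data set)
counts <- c(5, 4, 4, 5, 6, 6, 5, 5, 5, 6, 6, 4, 4, 4, 4, 6,
            7, 5, 4, 5, 5, 5, 4, 6, 10, 5, 5, 4, 5, 4, 6, 5)

min_count <- 4                 # minimum possible count in the task
shifted <- counts - min_count  # shifted counts start at 0

# Quick look at the shifted distribution and its dispersion
table(shifted)
c(mean = mean(shifted), variance = var(shifted))

# Hypothetical helper (not run here): same predictors as the models in
# this post, but with the shifted outcome and no truncation.
# Assumes `d` has the columns counts, prime, task and subject.
fit_shifted <- function(d) {
  d$shifted <- d$counts - min_count
  brms::brm(shifted ~ 1 + prime + task + (1 | subject),
            family = brms::negbinomial(), data = d)
}
```

Comparing the mean and variance of the shifted counts gives a rough indication of whether a Poisson (mean equal to variance) or a negative binomial (variance larger than the mean) is the more plausible starting point.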
I have fitted the following models:
```r
library(brms)
library(tibble)

# the data look like this (40% excerpt):
counts <- c(5, 4, 4, 5, 6, 6, 5, 5, 5, 6, 6, 4, 4, 4, 4, 6,
            7, 5, 4, 5, 5, 5, 4, 6, 10, 5, 5, 4, 5, 4, 6, 5)
prime <- c(rep("A", 16), rep("B", 16))
task <- c(rep("1", 8), rep("2", 8), rep("1", 8), rep("2", 8))
subject <- paste0("S", 1:32)
data <- tibble(counts, prime, task, subject)

# models
## standard (no truncation)
m1 <- brm(counts ~ 1 + prime + task + (1 | subject),
          family = negbinomial(),
          prior = set_prior("normal(0, 5)", class = "b"),
          data = data, save_all_pars = TRUE)

## truncated at the minimum possible count (lower bound 4)
m2 <- brm(counts | trunc(lb = 4) ~ 1 + prime + task + (1 | subject),
          family = negbinomial(),
          prior = set_prior("normal(0, 5)", class = "b"),
          data = data, save_all_pars = TRUE)
```
I have tried this with Poisson and negative binomial distributions (as well as other options, such as gamma). I have also tried adding a couple of other predictors (e.g. gender), but they don't make a difference.
From the posterior predictive checks, lognormal provides the best fit but is inappropriate (it assumes a continuous outcome); Poisson and negative binomial come close behind.
[Figure: pp_check, non-truncated negative binomial]
[Figure: pp_check, truncated negative binomial (results are similar with Poisson)]
However, as you can see in the figure, the pp_checks show that the models do not capture the 4s and 5s well: they overpredict the 4s and underpredict the 5s.
I am not sure how to improve the fit before deciding on a model with LOO-CV.
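To look at the 4s and 5s specifically, a bar-style posterior predictive check may be easier to read than the default density overlay, because it compares observed and predicted frequencies for each discrete count. The sketch below computes the observed proportions from the excerpt in this post and defines a small helper, `bar_check`, which is a made-up name; it assumes a fitted brms model is passed in and is not run here.

```r
# Excerpt of the counts from this post (not the full data set)
counts <- c(5, 4, 4, 5, 6, 6, 5, 5, 5, 6, 6, 4, 4, 4, 4, 6,
            7, 5, 4, 5, 5, 5, 4, 6, 10, 5, 5, 4, 5, 4, 6, 5)

# Observed proportion of each count value, for comparison with the plot
observed_props <- prop.table(table(counts))
round(observed_props, 2)

# Hypothetical helper: bar-style posterior predictive check for a fitted
# brms model (observed counts as bars, predictions as points/intervals).
bar_check <- function(fit) {
  brms::pp_check(fit, type = "bars")
}

# Usage, once the models have been fitted:
# bar_check(m1)
# bar_check(m2)
```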
Any advice most welcome!
- brms beginner ;-)
- Operating System: Mac OSX
- brms Version: brms_2.14.4