Hi everyone,
I have data from great apes, where we measured how long they stayed in a specific zone. Each ape did 3 trials, one for each of 3 conditions. Because the outcome is a duration and we stopped measuring after 6 minutes, it is bounded between 0 and 360 seconds.
I fitted a brms model to these data to assess the difference between conditions:
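To make the layout concrete, here is a minimal sketch of data in the same long format. The column names match the model formula in this post, but the number of apes, the condition labels, and every value are invented for illustration.

```r
# Made-up data in long format: one row per ape per condition.
# n_apes, the condition labels "A"/"B"/"C", and all times are hypothetical.
n_apes <- 4
Apedata <- data.frame(
  subject   = factor(rep(paste0("ape", 1:n_apes), each = 3)),
  condition = factor(rep(c("A", "B", "C"), times = n_apes)),
  approach_time_Final = c(0, 12.5, 360, 3.2, 0, 47.8,
                          360, 1.1, 95.0, 0.4, 210.3, 0)
)

# Each ape should contribute exactly one trial per condition.
trials_per_cell <- table(Apedata$subject, Apedata$condition)
stopifnot(all(trials_per_cell == 1))

# Every time must lie inside the 0-360 s observation window.
in_window <- Apedata$approach_time_Final >= 0 &
  Apedata$approach_time_Final <= 360
stopifnot(all(in_window))
```

With this layout, `(1 | subject)` gives each ape its own intercept across its three trials.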
Specify the priors
# Specify the priors for the model
prior <- c(
  prior(normal(0, 10), class = "Intercept"),
  prior(normal(0, 10), class = "b"),
  prior(inv_gamma(0.1, 0.1), class = "sigma")
)
Specify the model formula for conditions
# Specify the model formula
formula <- approach_time_Final | trunc(lb = -0.01) ~ condition + (1 | subject)
Fit the model
# Fit a log link Gaussian model
model <- brm(
  formula,
  data = Apedata,
  family = ???,
  prior = prior,
  iter = 50000, chains = 3, warmup = 20000, cores = 6,
  control = list(adapt_delta = 0.9),
  save_pars = save_pars(all = TRUE)
)
However, I can’t find a response family that fits the data. There are relatively many low values (and even a fair number of zeros).
I have tried a variety of distributions (including survival distributions, for which I set the zeros to 0.001), but something always goes wrong in a different part of the subsequent analysis:
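A quick way to quantify how much of the data sits at or near the boundaries is sketched below. The times are invented, and the 5-second threshold for "low" is an arbitrary choice made only for this illustration.

```r
# Invented times in seconds, used only to illustrate the check.
times <- c(0, 0, 0.8, 1.5, 2.0, 4.3, 9.9, 35, 120, 360, 360, 0)
cap <- 360  # measurement stopped after 6 minutes

prop_zero <- mean(times == 0)            # share of exact zeros
prop_cap  <- mean(times == cap)          # share of trials that hit the cutoff
prop_low  <- mean(times > 0 & times < 5) # "low" = under 5 s (arbitrary threshold)

round(c(zero = prop_zero, cap = prop_cap, low = prop_low), 3)
```

A large share of exact zeros is a point mass at 0, which a single continuous positive distribution cannot reproduce.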
- Sometimes the model itself does not converge or shows poor Gelman-Rubin or Geweke diagnostics.
- Some distributions produce values outside the possible range (e.g., negative values or values much higher than 360), and truncating them results in one of the other problems.
- Sometimes the posterior predictive check does not even compute (I get NA warnings).
- Sometimes it then fails to generate expected predictive values (which I need for calculating contrasts).
- And finally, I once had a distribution that seemed fine in all of these ways, until I tried to compute Bayes factors between this model and a model with additional covariate predictors. I then got an error that some estimated likelihoods were inf/inf or -inf/-inf, suggesting that this was not an appropriate distribution either.
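One possible mechanism behind non-finite likelihood ratios, offered as a guess rather than a diagnosis of the error above: positive-only distributions assign a log density of -Inf to an exact zero, and the difference of two such values is undefined. The sketch below uses only base R density functions with arbitrary parameter values; it does not reproduce the brms or bridge sampling computation.

```r
# Log density of an exact zero under two positive-only distributions.
# Parameter values are arbitrary and chosen only for illustration.
ll_lognormal <- dlnorm(0, meanlog = 0, sdlog = 1, log = TRUE)
ll_gamma     <- dgamma(0, shape = 2, rate = 1, log = TRUE)

# Both are -Inf, so their difference (a log likelihood ratio) is NaN.
log_ratio <- ll_lognormal - ll_gamma

# Replacing the zero with 0.001 gives a finite, though very low, log density.
ll_nudged <- dlnorm(0.001, meanlog = 0, sdlog = 1, log = TRUE)

c(lognormal = ll_lognormal, gamma = ll_gamma,
  ratio = log_ratio, nudged = ll_nudged)
```

The nudged value is finite, but it sits far out in the lower tail, so a handful of such observations can still dominate the likelihood.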
So if anyone has any ideas on what distribution to use, it would be greatly appreciated!
Best,
Wouter