Dear all,
I am a second-year Master’s student in ethology and cognition, currently doing an internship at the CNRS – Laboratory of Cognitive and Adaptive Neurosciences (UMR 7364), University of Strasbourg, France. I’m working on a project focusing on interspecific differences in exploratory behavior in two non-human primate species.
I would like to use a Bayesian multilevel model to evaluate the effects of several variables on subjects’ latencies to explore a novel stimulus, using the brms package.
Latency is the time (in seconds) it takes for an individual to approach and touch a novel stimulus. Non-responses are right-censored using cens(1 - insp_event), where insp_event is 1 if the event occurred (exploration) and 0 if it did not.
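To make the coding explicit, here is a minimal sketch with made-up values (not my real data). The column names `subject` and `latency`, the helper column `censored`, and the 300 s cap are assumptions for illustration only. As far as I understand from the brms documentation, `cens()` treats 0 as fully observed and 1 as right-censored, which is why the event indicator is flipped.

```r
# Toy data, not the real dataset: latencies in seconds, capped at a
# hypothetical maximum observation time of 300 s.
max_time <- 300
toy <- data.frame(
  subject    = c("a", "a", "b", "b"),
  latency    = c(12.5, 300, 48.0, 300),
  insp_event = c(1, 0, 1, 0)  # 1 = explored, 0 = did not explore in time
)

# Flip the event indicator: 0 = observed latency, 1 = right-censored latency.
toy$censored <- 1 - toy$insp_event

# Sanity check: every censored row should sit exactly at the cap.
stopifnot(all(toy$latency[toy$censored == 1] == max_time))
```

Writing `cens(censored)` with this helper column should be equivalent to writing `cens(1 - insp_event)` directly in the formula.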
As I am completely new to Bayesian statistics and to brms, I would be extremely grateful for any advice you could provide to help me apply it appropriately to my data.
Here is my current model:
brm(latency | cens(1 - event) ~ var1 + var2 + var3 + (1 | random_effect), data = data, family = lognormal(), chains = 4, cores = 4)
I have a few specific questions:
- Use of priors: Is it always necessary to define priors in the model? Should priors only reflect strong theoretical expectations, or should they be specified more generally for all parameters?
- Model fit and interpretability: Which diagnostics or checks are most important for assessing whether the model fits our data well and whether the results are interpretable? When inspecting the model output, we currently focus on Rhat and effective sample size (ESS). Is it sufficient to check that Rhat values are close to 1 and that ESS values are around or above 1000, or are there other thresholds or diagnostics we should pay close attention to?
- Posterior predictive checks: When we run pp_check() to assess model fit visually, we observe a clear overrepresentation of very short latencies in the observed data compared to the model predictions. Is this problematic, and how should we interpret or address it?
Please find attached a pp_check plot showing the overrepresentation of very short latencies.
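For concreteness, this is roughly the workflow the three questions above refer to, written as a minimal sketch rather than my actual script. The column names (`latency`, `event`, `species`, `subject`), the simulated data, the 300 s cap, the helper function names, and especially the prior scales are all assumptions chosen for illustration, not recommendations. The brms calls are wrapped in functions so that nothing is compiled until they are called.

```r
# All column names below (latency, event, species, subject) are hypothetical.
latency_formula <- latency | cens(1 - event) ~ species + (1 | subject)

# Tiny simulated data set so the sketch is self-contained; not real data.
set.seed(1)
n <- 40
toy <- data.frame(
  subject = factor(rep(1:8, each = 5)),
  species = factor(rep(c("A", "B"), each = 20)),
  latency = pmin(rlnorm(n, meanlog = 4, sdlog = 1), 300)  # capped at 300 s
)
toy$event <- as.integer(toy$latency < 300)  # 1 = explored before the cap

# List every parameter that can take a prior, together with the brms defaults.
show_default_priors <- function(d) {
  brms::get_prior(latency_formula, data = d, family = brms::lognormal())
}

# Illustrative priors only; the scales are assumptions, not advice.
example_priors <- function() {
  c(
    brms::set_prior("normal(0, 1)", class = "b"),
    brms::set_prior("exponential(1)", class = "sd"),
    brms::set_prior("exponential(1)", class = "sigma")
  )
}

# Fit the censored lognormal model with the priors above.
fit_latency_model <- function(d, priors = example_priors()) {
  brms::brm(
    latency_formula, data = d, family = brms::lognormal(),
    prior = priors, chains = 4, cores = 4
  )
}

# Print the summary (Rhat, Bulk_ESS, Tail_ESS) and return the pp_check plot.
check_fit <- function(fit) {
  print(summary(fit))
  brms::pp_check(fit, ndraws = 100)  # observed vs. replicated latencies
}
```

With brms installed, `fit <- fit_latency_model(toy)` followed by `check_fit(fit)` would reproduce the kind of output and plot I am asking about.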
Additionally, some of our latency values are right-censored (when subjects did not explore the stimulus within the maximum observation time). I am still unsure how to properly account for this in the model, and how it might affect the interpretation of the results.
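My current understanding, which I would be glad to have confirmed or corrected, is that a right-censored observation contributes the probability that the true latency exceeds the cutoff, rather than a density at the recorded time. The base R sketch below illustrates this for a lognormal distribution. The parameter values and the 300 s cutoff are made up for illustration.

```r
# Assumed lognormal parameters on the log scale; purely illustrative.
meanlog  <- 4
sdlog    <- 1
max_time <- 300  # hypothetical maximum observation time in seconds

# Observed latency of 45 s: log density at the recorded time.
loglik_observed <- dlnorm(45, meanlog, sdlog, log = TRUE)

# Right-censored latency: log probability that the true latency exceeds the cap.
loglik_censored <- plnorm(max_time, meanlog, sdlog,
                          lower.tail = FALSE, log.p = TRUE)
```

If this is right, the coefficients would describe the log-latency that would have been observed without the time limit, not the capped values.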
Any guidance or brief suggestions would be immensely appreciated. Thank you very much for your time and help!
- Operating System: Windows 10
- brms Version: 2.22.0