Hi everyone,
brms novice here: I'm a neuroscience PhD student investigating oral ketamine treatment for mental health, with EEG-derived information-theoretic metrics as prognostic biomarkers.
I’m hoping to get some guidance on my mixed effects model specification and validation for EEG-derived metrics of signal complexity. I’m about two months into the self-teaching journey and my brain is teetering on mental anguish haha! Foolishly, I left asking for advice this long.
Bit of relevant background: a 6-week treatment of low-dose ketamine was given to participants with suicidality. Participants were classified as responders or non-responders at post-treatment and follow-up. EEG was collected in two conditions (eyes closed and eyes open) across three timepoints: baseline, 6 weeks (post-treatment), and 10 weeks (follow-up). For each task and timepoint, two complexity (signal irregularity/diversity) metrics were obtained: Lempel-Ziv complexity and multiscale entropy (MSE). My Lempel-Ziv brms model is straightforward. The unique (and difficult) thing with MSE is that it has a scale factor: the original time series is coarse-grained at each scale to obtain (sample) entropy estimates that (supposedly) index dynamics occurring across faster and slower timescales (see this paper for methodology: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.89.068102).
MSE has a lower bound of 0 but no upper bound (although literature values rarely exceed 2). The value range depends on the chosen tolerance, which is essentially the threshold for deciding whether two signal patterns count as similar.
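For readers less familiar with MSE, here is a minimal base-R sketch of the coarse-graining and sample-entropy steps described above. It is not the pipeline I actually used for the EEG data: the function names, the embedding dimension `m = 2`, and the tolerance of 0.2 times the SD of the original series are illustrative assumptions only.

```r
# Coarse-grain a series: average consecutive, non-overlapping windows of length `scale`.
coarse_grain <- function(x, scale) {
  n_windows <- floor(length(x) / scale)
  x <- x[seq_len(n_windows * scale)]  # drop the incomplete trailing window
  colMeans(matrix(x, nrow = scale))
}

# Sample entropy: -log(A / B), where B counts template pairs of length m and
# A counts pairs of length m + 1 whose Chebyshev distance is within tolerance r.
sample_entropy <- function(x, m = 2, r = 0.2 * sd(x)) {
  n <- length(x)
  count_matches <- function(len) {
    # Use the same n - m starting points for both template lengths
    idx <- outer(seq_len(n - m), seq_len(len) - 1, "+")
    templates <- matrix(x[as.vector(idx)], ncol = len)
    d <- as.vector(dist(templates, method = "maximum"))  # pairs only, no self-matches
    sum(d <= r)
  }
  -log(count_matches(m + 1) / count_matches(m))
}

# MSE: sample entropy of the coarse-grained series at each scale,
# with the tolerance fixed from the original (scale 1) series.
multiscale_entropy <- function(x, scales = 1:10, m = 2, r_frac = 0.2) {
  r <- r_frac * sd(x)
  vapply(scales,
         function(s) sample_entropy(coarse_grain(x, s), m = m, r = r),
         numeric(1))
}

# Usage on simulated white noise (not EEG data)
set.seed(1)
noise <- rnorm(1000)
mse_noise <- multiscale_entropy(noise)
print(round(mse_noise, 2))
```

Because the tolerance is fixed while coarse-graining shrinks the variance of the series, the attainable value range changes with scale, which is part of why the outcome has no natural upper bound.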
Replicating the inclusion of scale as a fixed factor in this paper ("Neural complexity EEG biomarkers of rapid and post-rapid ketamine effects in late-life treatment-resistant depression: a randomized control trial", Neuropsychopharmacology), I specified a mixed effects model with random intercepts and slopes, and interactions between the factors timepoint, task, response status and scale (1-10).
- Timepoints: ses-01 (PRE), ses-02 (POST), ses-03 (FUP)
- Task: Eyes closed (EC) and eyes open (EO)
- Response: Non-responder (0) and Responder (1)
- Scale: MSE 1-10
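To make the long-format layout concrete, here is a small sketch of the design cells implied by the factors above. This is not the attached script: the number of subjects and the subject labels are made up, and `Responder` is left out because its coding would need to come from the real data.

```r
# Hypothetical number of subjects, for illustration only
n_subjects <- 20

# One row per Subject x Timepoint x Task x Scale
design <- expand.grid(
  Scale     = 1:10,
  Task      = c("EC", "EO"),
  Timepoint = c("ses-01", "ses-02", "ses-03"),
  Subject   = sprintf("sub-%02d", seq_len(n_subjects)),
  stringsAsFactors = TRUE
)

# Each subject contributes 3 timepoints x 2 tasks x 10 scales = 60 rows
rows_per_subject <- table(design$Subject)
print(head(design))
print(unique(as.vector(rows_per_subject)))
```

With complete data every subject contributes 60 MSE values, so observations are heavily clustered within subject, session, and task.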
Script for creating reproducible example in long or wide format:
mockData.R (1.9 KB)
Please note I’ve kept the iterations low while I work through model selection. I originally had them at 10,000, but this made the whole process cumbersome. I understand that increasing the iterations (and hence ESS) will improve the model estimates, but I want to settle the model specification before progressing to this.
MSE distribution:
Formula (Note: I’m not interested in random slope correlations):
# `data` is the long-format data frame, read in before this call
model <- brm(
  MSE | resp_trunc(lb = 0) ~ 1 + Responder * Timepoint * Task * Scale +
    (1 + Responder + Timepoint + Task + Scale || Subject),  # `||` drops random-effect correlations
  data = data,
  family = gaussian(),
  chains = 4,
  cores = 4,
  iter = iterations,
  warmup = warmup,
  prior = prior_random,
  init = init,
  sample_prior = 'yes',
  save_pars = save_pars(all = TRUE),
  seed = 22,
  # file = "MSE_model2.rda",
  control = control
)
With the following priors:
prior_random <- c(
  prior(normal(1.2, 0.05), class = 'Intercept', lb = 0),
  prior(normal(0, 0.10), class = 'b'),
  prior(student_t(3, 0.05, 0.05), class = 'sigma'),
  prior(cauchy(0, 0.01), class = 'sd')
)
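As a rough sanity check on what the Intercept and sigma priors imply on the outcome scale, here is a base-R simulation sketch. It is a simplification rather than a replacement for the prior-only brms fit: it ignores the `b` and `sd` terms entirely, assumes sigma is bounded below at 0, and the helper names are made up.

```r
set.seed(22)
n_draws <- 10000

# Intercept ~ normal(1.2, 0.05)
intercept <- rnorm(n_draws, mean = 1.2, sd = 0.05)

# sigma ~ student_t(3, 0.05, 0.05), truncated at zero via rejection sampling
r_student_lb0 <- function(n, df, mu, s) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- mu + s * rt(n, df)
    out <- c(out, draw[draw > 0])
  }
  out[seq_len(n)]
}
sigma <- r_student_lb0(n_draws, df = 3, mu = 0.05, s = 0.05)

# Outcome ~ normal(intercept, sigma) truncated below at 0, via the inverse CDF
rnorm_lb0 <- function(mu, sigma) {
  p_lo <- pnorm(0, mu, sigma)  # probability mass below the truncation point
  qnorm(runif(length(mu), p_lo, 1), mu, sigma)
}
y_sim <- rnorm_lb0(intercept, sigma)

# Implied prior predictive range for an intercept-only cell
print(quantile(y_sim, c(0.025, 0.5, 0.975)))
```

If the simulated range is much narrower than the observed MSE range across scales, that would suggest these priors are quite informative for this outcome.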
And model parameters (iterations low just for the model selection step):
iterations <- 2000
warmup <- iterations / 2
init <- 0
control <- list(
  adapt_engaged = TRUE,
  adapt_delta = 0.95,
  stepsize = 0.05,
  max_treedepth = 15
)
The pp_check of the prior-only model shows reasonable estimates:
Chain mixing is okay (no convergence warnings).
Summary (snippet for brevity):
Posterior predictive checks look as follows (stat estimates for other factors similar):
Min:
Max:
Mean:
My questions:
- Is it statistically/logically sound to include the scale factor in the model, or should the different scale values of the MSE be treated as separate DVs in a multivariate model, or as separate models for each scale value? My use of MSE comes from this paper ("Correction: Neural complexity EEG biomarkers of rapid and post-rapid ketamine effects in late-life treatment-resistant depression: a randomized control trial", PMC), which applied this approach in a frequentist LME. As I’ve iterated through the model building process, the feeling that this isn’t ‘right’ has crept in. To be honest, if a multivariate model were appropriate this would be great, as the inclusion of scale makes the population effects somewhat unwieldy.
- Whilst the posterior predictive checks indicate the MSE model fit is decent, the min and max stat predictions are somewhat off (this is consistent across factors). Is this related to my model specification, and are there ways to improve this?
- Would a mixture model be best if I keep scale as a population effect? I originally dabbled with a mixture model to capture the bimodal nature of the MSE values; however, the gaussian model produced the posterior predictive checks above (which I thought were reasonable).
Please scrutinise and point out any errors, omissions, or imperfections as I wish to learn!
If any further info is needed, let me know.
Thank you!