Fitting a model like
fit <- brm(y ~ Group * mo(Session) + (mo(Session) | IDanon), data = my_data)
with a continuous outcome, a binary grouping variable (Group; a factor), and an ordinal time covariate (Session; ordered), flags a few problematic observations in the loo diagnostics. Refitting the model with the problematic observations left out, as in:
fit <- add_criterion(fit, "loo", reloo = TRUE)
always results in an error during the refit of the last model, for instance:
10 problematic observation(s) found.
The model will be refit 10 times.
Fitting model 1 out of 10 (leaving out observation 37)
Fitting model 2 out of 10 (leaving out observation 84)
.
.
.
Fitting model 10 out of 10 (leaving out observation 228)
Start sampling
Error in draws$Csp[[i]] <- as_draws_matrix(draws$Csp[[i]], dim = dim) :
more elements supplied than there are to replace
After fitting the model again, a different number of problematic observations will (obviously) occur. When refitting with add_criterion(), the same error occurs with the last fit.
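For reference, the indices that get flagged can be listed before the refits with the Pareto-k helpers from the loo package (just a sketch, assuming loo is attached):
library(loo)
loo_res <- loo(fit)
# observations with Pareto k above the 0.7 threshold
pareto_k_ids(loo_res, threshold = 0.7)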
Replacing “loo” with “waic” in the call to add_criterion does not throw an error.
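That is, a call like
fit <- add_criterion(fit, "waic")
completes without complaint.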
When fitting the model without the mo() function (see the sketch below), add_criterion does not throw an error, but the number of problematic observations increases from between 5 and 10 to well above 200.
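By “without the mo() function” I mean something along these lines (Session coerced to a numeric score; a sketch of the variant, not my exact call):
my_data$Session_num <- as.numeric(my_data$Session)
fit_lin <- brm(y ~ Group * Session_num + (Session_num | IDanon), data = my_data)
fit_lin <- add_criterion(fit_lin, "loo", reloo = TRUE)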
Some system information:
- Operating System: Debian 9 (old stable)
- brms Version: 2.10.3
I’d be grateful for any hints and suggestions to solve this problem.
M.
Can you send me a minimal reproducible example for me to try out?
Sure can. It just took a while until I figured out how to reproduce the error in a made-up data set. The error can only be reproduced when there is (unbalanced?) missing data over time, i.e., participants dropping out after an individual number of repeated measurements.
Am I missing something here, or is this behavior unintended?
library(brms)
library(dplyr)
library(ggplot2)
id <- 1:40 # participants
n_ids <- length(id)
group <- c(0, 1) # control & treatment groups
time <- 0L:15L # repeated measurement; 0 = baseline
n_measur <- length(time)
set.seed(987501924)
# long format: one row per participant (ID) and measurement occasion (Time)
contrived_df <- data.frame(ID = rep(id, each = n_measur),
Group = rep(sample(group,
size = n_ids,
replace = TRUE),
each = n_measur),
Time = rep(time, times = n_ids))
# simulate the response: noise scaled by ID plus group, time, and interaction effects
contrived_df <- mutate(contrived_df,
y = rnorm(n_ids, sd = 0.5) * ID +
4 * Group +
0.3 * Time +
2 * (Group * Time))
contrived_df <- group_by(contrived_df, ID)
# simulate increasing drop-out prob. over time:
contrived_df <- filter(contrived_df,
Time <= sample(n_measur - 1,
size = 1,
prob = dexp(1:(n_measur - 1),
rate = 0.4)))
ggplot(tally(contrived_df), aes(n)) +
geom_density() +
geom_rug(alpha = 0.4) +
xlab("Time") +
ggtitle("Distribution of the number of repeated measurements")
ggplot(contrived_df, aes(Time,
y,
color = factor(Group))) +
geom_jitter(alpha = 0.5)
fit <- brm(y ~ Group * mo(Time) +
(mo(Time) | ID),
data = contrived_df,
control = list(adapt_delta = .95))
fit <- add_criterion(fit, "loo", reloo = TRUE)
Cheers, Michael
Thank you! I think I have an idea of what the problem could be.
Monotonic effects do not allow for predictions at predictor values that lie outside the range of the original data. So if, in the original data (the training data in kfold), Time was between, say, 0 and 10, but in the new data (the test data in kfold), Time was, say, 11, then we would not be able to make predictions, and any attempt would result in an error.
This error is just not clearly visible in kfold, as it uses futures for potential parallelization, which tends to hide certain error messages and thus makes debugging harder.
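Roughly, with the objects from your reprex (a sketch, not run here; whether it triggers depends on the simulated drop-out): if the single observation left out by a refit carries the largest Time value and no other row shares it, the refitted model has never seen that category of the monotonic predictor, and predicting for it fails.
max_t <- max(contrived_df$Time)
row_id <- which(contrived_df$Time == max_t)
if (length(row_id) == 1) {
  # refit without that single row, as reloo would do for a problematic observation
  fit_sub <- update(fit, newdata = contrived_df[-row_id, ])
  # the refitted model has never seen mo(Time) at max_t, so predicting here should error
  predict(fit_sub, newdata = contrived_df[row_id, ])
}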
Great, thanks for the swift response and the clarification! Guess I have to input my repeated measurements as integers (or factor()-ize them) instead to be able to use add_criterion().
I am afraid I don’t understand your solution, but please tell me if it worked out.