I’m getting strange results from fitting a non-linear model with brms based on proportion data in R. Reproducible example with data and code below:
library(ggplot2)
library(brms)
library(tidybayes)
# simulate data
df <- data.frame(
time = time,
s1 = (round(dnorm(time, mean = 1.5, sd = 1) * 80)) + round(rnorm(49, 0, 0.5)),
s2 = (round(dnorm(time, mean = 4, sd = 2) * 90)) + round(rnorm(49, 0, 0.5)),
s3 = (round(dnorm(time, mean = 3, sd = 1.5) * 70)) + round(rnorm(49, 0, 0.5)),
s4 = (round(dnorm(time, mean = 2, sd = 1) * 85)) + round(rnorm(49, 0, 0.5)),
s5 = (round(dnorm(time, mean = 2.5, sd = 1) * 75)) + round(rnorm(49, 0, 0.5))
) %>%
pivot_longer(cols = s1:s5, names_to = "replicate", values_to = "state") %>%
dplyr::group_by(replicate) %>%
dplyr::mutate(state = abs(state)) %>%
dplyr::mutate(cum_state = cumsum(state)) %>%
filter(time < 24)
ggplot() +
geom_line(data = df, aes(time, cum_state, color = replicate), show.legend=TRUE) +
geom_point(data = df, aes(time, cum_state, fill = replicate), shape=21, size=2, show.legend=TRUE)
Where cum_state is the cumulative proportion of 100 individuals transitioning from state A to state B each hour, and S1-S5 are replicates. I’ve fit a four parameter logistic curve to to the data, where the lower asymptote is set at close to zero (0.01):
priors <- prior(normal(75, 25), nlpar = "asymptote") + # set broad priors
prior(normal(60, 30), nlpar = "midpoint") +
prior(normal(4.5, 2.4), nlpar = "scale")
four_parameter <- cum_state ~ (asymptote) + (0.01 - (asymptote)) / (1 + ((time / (midpoint))^(scale)))
model_4param <- brm(bf(four_parameter, asymptote + midpoint + scale ~ 1, nl = TRUE),
data = df, prior = priors, control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)
plot(conditional_effects(model_4param), points = TRUE)
I’m happy with the model fit and outcome, but I’d like to predict the outcome on an individual basis (i.e. for any randomly selected individual from the 100. My understanding of this was to fit a binomial model specifying the number of trials (out of 100 individuals):
four_parameter_trials <- cum_state | trials(100) ~ (asymptote) + (0.01 - (asymptote)) / (1 + ((time / (midpoint))^(scale)))
model_4param_trials <- brm(bf(four_parameter_trials, asymptote + midpoint + scale ~ 1, nl = TRUE), family = binomial(),
data = df, prior = priors, control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)
plot(conditional_effects(model_4param_trials), points = TRUE)
The outcome here is obviously quite different and incorrect, as brms predicts the lower asymptote as 0.5:
Q1: Where am I going wrong in specifying the model? Running this as a standard binomial model works fine, so the specification of trials seems correct:
model_logistic <- brm(cumulative_settled | trials(100) ~ time, family = binomial(), data = df, chains = 4, iter = 10000)
plot(conditional_effects(model_logistic), points = TRUE)
Q2: How can I predict from model_4param what the outcome will be on an individual basis? My current approach was to use the output from conditional_effects and divide by the number of individuals (100) to get a proportion, and then use this with rbinom to get an individual probability. For example, if the cumulative number of individuals that transitioned state at t=10 was 75, then the probability at the individual would be rbinom(1,1,0.75). Is this approach valid? Ultimately what I’d like to do is predict from the model for any individual at what time (in minute intervals) they switch from one state to the other.
Q3: Not entirely related to this question: how can I account for replicates in the non-linear model? using add_epred_draws() with model_4param gives predictions for each replicate level. Each timepoints is nested within a replicate, but I’m after a global effect here in the epred draws, not accounting for each replicate.
df %>%
data_grid(time = seq_range(time, n = 200)) %>%
add_epred_draws(model_4param, ndraws = 1) %>% View()
Please also provide the following information in addition to your question:
- Operating System: MacOS 13.2.1
- brms Version: 2.18.0