Specifying trials in a binomial non-linear model (four-parameter logistic) giving strange results

I’m getting strange results from fitting a non-linear model with brms based on proportion data in R. Reproducible example with data and code below:

library(ggplot2)
library(brms)
library(tidybayes)

# simulate data 
df <- data.frame(
  time = time,
  s1 = (round(dnorm(time, mean = 1.5, sd = 1) * 80)) + round(rnorm(49, 0, 0.5)),
  s2 = (round(dnorm(time, mean = 4, sd = 2) * 90)) + round(rnorm(49, 0, 0.5)),
  s3 = (round(dnorm(time, mean = 3, sd = 1.5) * 70)) + round(rnorm(49, 0, 0.5)),
  s4 = (round(dnorm(time, mean = 2, sd = 1) * 85)) + round(rnorm(49, 0, 0.5)),
  s5 = (round(dnorm(time, mean = 2.5, sd = 1) * 75)) + round(rnorm(49, 0, 0.5))
) %>%
  pivot_longer(cols = s1:s5, names_to = "replicate", values_to = "state") %>%
  dplyr::group_by(replicate) %>%
  dplyr::mutate(state = abs(state)) %>%
  dplyr::mutate(cum_state = cumsum(state)) %>%
  filter(time < 24)  

ggplot() +
  geom_line(data = df, aes(time, cum_state, color = replicate), show.legend=TRUE) +
  geom_point(data = df, aes(time, cum_state, fill = replicate), shape=21,  size=2, show.legend=TRUE) 

Where cum_state is the cumulative proportion of 100 individuals transitioning from state A to state B each hour, and S1-S5 are replicates. I’ve fit a four parameter logistic curve to to the data, where the lower asymptote is set at close to zero (0.01):

priors <- prior(normal(75, 25), nlpar = "asymptote") + # set broad priors
   prior(normal(60, 30), nlpar = "midpoint") +
   prior(normal(4.5, 2.4), nlpar = "scale")

four_parameter <- cum_state  ~ (asymptote) + (0.01 - (asymptote)) / (1 + ((time / (midpoint))^(scale)))

model_4param <- brm(bf(four_parameter, asymptote + midpoint + scale ~ 1, nl = TRUE),
  data = df, prior = priors, control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)

plot(conditional_effects(model_4param), points = TRUE)

I’m happy with the model fit and outcome, but I’d like to predict the outcome on an individual basis (i.e. for any randomly selected individual from the 100. My understanding of this was to fit a binomial model specifying the number of trials (out of 100 individuals):

four_parameter_trials <- cum_state | trials(100) ~ (asymptote) + (0.01 - (asymptote)) / (1 + ((time / (midpoint))^(scale)))

model_4param_trials <- brm(bf(four_parameter_trials, asymptote + midpoint + scale ~ 1, nl = TRUE), family = binomial(), 
  data = df, prior = priors, control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)

plot(conditional_effects(model_4param_trials), points = TRUE)

The outcome here is obviously quite different and incorrect, as brms predicts the lower asymptote as 0.5:

Q1: Where am I going wrong in specifying the model? Running this as a standard binomial model works fine, so the specification of trials seems correct:

model_logistic <- brm(cumulative_settled | trials(100) ~ time, family = binomial(), data = df, chains = 4, iter = 10000)
plot(conditional_effects(model_logistic), points = TRUE)

Q2: How can I predict from model_4param what the outcome will be on an individual basis? My current approach was to use the output from conditional_effects and divide by the number of individuals (100) to get a proportion, and then use this with rbinom to get an individual probability. For example, if the cumulative number of individuals that transitioned state at t=10 was 75, then the probability at the individual would be rbinom(1,1,0.75). Is this approach valid? Ultimately what I’d like to do is predict from the model for any individual at what time (in minute intervals) they switch from one state to the other.

Q3: Not entirely related to this question: how can I account for replicates in the non-linear model? using add_epred_draws() with model_4param gives predictions for each replicate level. Each timepoints is nested within a replicate, but I’m after a global effect here in the epred draws, not accounting for each replicate.

df %>%
  data_grid(time = seq_range(time, n = 200)) %>%
  add_epred_draws(model_4param, ndraws = 1) %>% View()

Please also provide the following information in addition to your question:

  • Operating System: MacOS 13.2.1
  • brms Version: 2.18.0

When you use family = binomial(), by default brms assumes a logit link. By construction, your nonlinear predictor never goes below 0, so its logit never goes below 0.5. You can avoid this by manually specifying an identity link binomial(link = "identity").

The approach you outline here is valid for one-off prediction about whether a transition occurs by time t, but that does not directly yield a sampling distribution for t. For example, suppose the cumulative probability of a transition by time t = 10 is 0.5, and the cumulative probabilty by t = 11 is 0.501 (i.e. almost no individuals ever transition between t=10 and t=11). If we sample from rbinom(1,1,0.5) and get a zero at t = 10, we need a way to understand that we are overwhelmingly likely to still be a zero at t=11. Sampling from rbinom(1,1,0.501) won’t capture that. If you want to sample t directly, then one procedure that achieves this is: sample a uniform random variate \chi between 0 and 1. If \chi < l, where l is the value of the lower asymptote (divided by 100), then the transition time for that individual is negative infinity. If \chi > u, where u is the upper asymptote (divided by 100), then the transition time is positive infinity. Otherwise, the transition time is f^{-1}(\chi), where f is a function giving the fitted value of cum_state divided by 100, and f^{-1} is the inverse of f.

Add a random effect (or if desired a fixed effect) of replicate on each of the fitted nonlinear parameters. If treating the effects as random, you’ll likely want to allow them to correlate. To specify correlated random effects across multiple formulae, use the |<ID>| syntax from brms (see page 3 here)

Some aspects of your inference are probably sensitive to the choice of this asymptote, and I’d suggest doing some sensitivity analysis around your choice to fix it to 0.01.

You can avoid this by manually specifying an identity link binomial(link = "identity") .

I’ve stared at this code for quite some time and this never occurred to me. Thank you for highlighting the oversight!

model_4param_trials <- brm(bf(four_parameter_trials, asymptote + midpoint + scale ~ 1, nl = TRUE), family = binomial(link = "identity"), 
  data = df, prior = priors, control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)

Works correctly.

The approach you outline here is valid for one-off prediction about whether a transition occurs by time
t, but that does not directly yield a sampling distribution for t… Otherwise, the transition time is f−1(χ), where f is a function giving the fitted value of cum_state divided by 100.

I think I’m following here. As individuals can not transform back from state A to state B, predicting transitions for an individual would be:

P_individual <- df %>%
  data_grid(time = seq_range(time, n = (24))) %>%
  add_epred_draws(model_4param, ndraws = 1) %>%
  filter(replicate=="s1") %>%
  mutate(proportion = .epred/100) %>% 
  mutate(individual = as.numeric(rbinom(1, size = 1, prob = proportion))) %>% 
  ungroup() %>%
  mutate(individual = replace(individual, cumsum(individual) > 0, 1))

Add a random effect (or if desired a fixed effect) of replicate on each of the fitted nonlinear parameters. If treating the effects as random, you’ll likely want to allow them to correlate. To specify correlated random effects across multiple formulae, use the || syntax from brms

Thanks for this, I was unaware of the multilevel formula syntax in brms. Would I be specifying a correlated random error structure for the above example correctly here? (where replicates are modeled as correlated as they share same grouping factor time)?

 brm(bf(four_parameter_trials, 
  midpoint ~ (1|replicate|time), 
  scale ~ (1|replicate|time), 
  asymptote ~ (1|replicate|time), nl = TRUE),
  family = binomial(link = "identity"), data = df, prior = priors, 
  control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)

With the above example brms is giving the following error:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: binomial_lpmf: Probability parameter[11] is nan, but must be finite!  (in 'model8b5d3e53aaec_4cbbb527b3651d4c8a82a109fa24cd46' at line 98)```

What do you mean by “the outcome” here? My initial guess was that you meant the time at which an individual transitions. Alternatively, you could mean whether or not the individual has transitioned by some fixed time. Or perhaps something else entirely…

Hi there @ecologist,

The way you’ve specified the correlated random (or group) effects, (1 | replicate | time) isn’t quite right. The |<ID>| syntax that @jsocolar referenced works a bit differently. The brms multivariate model vignette gives a nice demonstration of how this syntax works. The short version is that you can pick an arbitrary name that is not a variable in the data set. Typical choices are a letter in my experience. This arbitrary name serves as a book-keeping device to handle the correlations of the group effects across parameters, responses, or wherever the correlated group-effects manifest. So for your case, you might use something like:

 brm(bf(four_parameter_trials, 
  midpoint + scale + asymptote ~ (1| w | replicate), nl = TRUE),
  family = binomial(link = "identity"), data = df, prior = priors, 
  control = list(adapt_delta = 0.999, max_treedepth = 20), chains = 4, iter = 10000)

Also, as an aside, I note that you’ve set adapt_delta to be quite high/close to 1, as well as ramped up max_treedepth to 20. To my eye this suggests that the geometry of your posterior is quite difficult for HMC to explore with your current parameterization and that you might benefit from reparameterizing your model in some way. I appreciate that the time-cost of delving deeper and likely needing to work directly in STAN may not be worth it in this particular case, but wanted to point out that you could likely gain some computational speed/efficiency this way.