Find the predictor value at which posterior_epred(model) = x (in a multilevel logistic regression)

Ladislas · January 29, 2025, 8:01pm

Context

Hello,

I am modelling behavioural data with a multilevel model specified as:

model <- brm(
    formula = resp ~ 1 + test * stim * trial + (1 + test * stim * trial | dyad / ppt),
    family = bernoulli(link = "logit"),
    data = df
    )

where resp is thus a binary (0/1) variable, test is a binary (factor) predictor with two levels (“individual” and “interactive”, contrast-coded as -0.5/+0.5), stim is a numerical variable, trial is also a numerical variable, ppt refers to the participant’s ID and dyad refers to the dyad in which the participant belongs (therefore ppt is nested within dyad).

I am particularly interested in the interaction effect between stim and test, for all possible values of trial. With some help from the brms::conditional_effects() function, I can retrieve the estimates of p(resp) for each crossing of stim, test, and trial (i.e., the output of posterior_epred()) using the following code (where the trial is represented in colour, and test levels in columns):

cond_effects <- conditional_effects(
    model,
    conditions = data.frame(trial = unique(model$data$trial) ),
    effects = "stim:test",
    method = "posterior_epred",
    plot = FALSE
    )[[1]]

cond_effects %>%
    ggplot(aes(x = stim, y = estimate__, color = as.factor(trial) ) ) +
    geom_hline(yintercept = 0.5, linetype = 2) +
    geom_line(show.legend = FALSE) +
    facet_wrap(~test) +
    scale_color_viridis_d() +
    theme_bw() +
    labs(
        x = "Stimulus (stim)",
        y = "Predicted probability",
        color = "Trial"
        )

Then, I am interested in the same effect for each participant, which I can also retrieve using conditional_effects(). Below is an example for two participants (in columns, and for the two test levels, in rows).

# creating a grid of values for predictions for only two participants
conditions <- crossing(
    trial = unique(df_trial$trial),
    ppt = c("6_06_A", "6_06_B")
    )

# computing conditional effects for each participant and trial
cond_effects <- conditional_effects(
    x = trial_model,
    effects = "stim:test",
    conditions = conditions,
    method = "posterior_epred",
    re_formula = NULL,
    resolution = 500,
    plot = FALSE
    )[[1]]

The problem

Now, I want to compute the value of stim for which p(resp) = 0.5 (i.e., posterior_epred(model) = 0.5), for each ppt, test, and trial.

If we ignore the random/varying effects (intercepts and slopes), and if we only had one slope, this would be given by -\alpha / \beta (i.e., minus the intercept divided by the slope). But because the model involves several interactions and random/varying slopes, I am struggling a bit to compute this “threshold” (as called in Kruschke’s book, adapted to brms here: 21 Dichotomous Predicted Variable | Doing Bayesian Data Analysis in brms and the tidyverse).
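As a sketch of how the single-slope result might generalise (the symbols t for test, s for stim, r for trial, and the grouping into A and B, are my own notation and not taken from Kruschke or brms), the linear predictor can be split into the terms that do not involve stim and the terms that multiply it:

$$
\operatorname{logit} p = \underbrace{\beta_0 + \beta_{t}\, t + \beta_{r}\, r + \beta_{tr}\, t r}_{A(t,\, r)} + \underbrace{\left(\beta_{s} + \beta_{ts}\, t + \beta_{sr}\, r + \beta_{tsr}\, t r\right)}_{B(t,\, r)}\, s
$$

Solving for s at a target probability p then gives

$$
s^{*}(p) = \frac{\operatorname{logit}(p) - A(t, r)}{B(t, r)}, \qquad s^{*}(0.5) = -\frac{A(t, r)}{B(t, r)},
$$

which reduces to -\alpha / \beta when there is only an intercept and one slope. With varying effects, each \beta above would presumably be replaced by the population-level coefficient plus the corresponding dyad-level and ppt-level deviations.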

What I have tried

I have tried to extend the analytical solution above to include interactions and varying slopes, but I am not fully satisfied with the results (they look weird), so I am wondering whether there is any implemented function that could help?

I have also tried to numerically approximate the value of stim for which p(resp)=0.5 from the output of conditional_effects(), but that is not fully satisfying either: it is not very precise (the precision is limited by the resolution of the grid) and it is computationally intensive.

Do you have any idea how to compute this in a more principled way? In brief, I am looking for something to perform the inverse operation that is performed by posterior_epred(). Instead of computing the value of p(resp) given some model parameters, I want to compute the value of stim for which p(resp)=0.5 (for all possible combinations of test, ppt, and trial).
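For what it is worth, here is a minimal base-R sketch of such an "inverse" operation for a single set of coefficients, using root finding with uniroot() rather than a grid. The coefficient values and the helper names p_resp() and stim_at() are hypothetical, made up for illustration; in practice the coefficients would come from one posterior draw of the fitted model.

```r
# Hypothetical coefficients for ONE posterior draw (values invented for illustration)
b <- c(Intercept = -0.4, test = 0.3, stim = 1.2, trial = -0.05,
       test_stim = 0.5, test_trial = 0.02, stim_trial = 0.01,
       test_stim_trial = -0.03)

# Forward direction: predicted probability for given predictor values
p_resp <- function(stim, test, trial, b) {
  eta <- b[["Intercept"]] + b[["test"]] * test + b[["stim"]] * stim +
    b[["trial"]] * trial + b[["test_stim"]] * test * stim +
    b[["test_trial"]] * test * trial + b[["stim_trial"]] * stim * trial +
    b[["test_stim_trial"]] * test * stim * trial
  plogis(eta)
}

# Inverse direction: value of stim at which p_resp() equals the target p
stim_at <- function(p, test, trial, b, interval = c(-50, 50)) {
  uniroot(function(s) p_resp(s, test, trial, b) - p,
          interval = interval, tol = 1e-10)$root
}

threshold_numeric <- stim_at(0.5, test = 0.5, trial = 3, b = b)

# Closed form for comparison: -(terms without stim) / (terms multiplying stim)
A <- b[["Intercept"]] + b[["test"]] * 0.5 + b[["trial"]] * 3 +
  b[["test_trial"]] * 0.5 * 3
B <- b[["stim"]] + b[["test_stim"]] * 0.5 + b[["stim_trial"]] * 3 +
  b[["test_stim_trial"]] * 0.5 * 3
threshold_closed <- -A / B
```

The root finder needs an interval that brackets the threshold, so the c(-50, 50) default is an assumption that would have to be adapted to the scale of stim.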

Thank you in advance for your help!

Ladislas

matti · January 30, 2025, 8:07am

Hi @Ladislas, yeah, this is a very useful calculation, and it can get confusing with lots of interactions! How have you calculated this for just one predictor? If you use index variables (i.e., estimate the slope for each condition rather than the differences between conditions), would that help?

Ladislas · January 31, 2025, 12:57pm

Hi,

thank you very much for your reply!

The solution -\alpha/\beta comes from Kruschke’s book and Solomon Kurz’s translation into brms (see the link in my original post), but I guess it’s a general property of the logit link: logit(0.5) = 0, so setting \alpha + \beta x = 0 gives x = -\alpha/\beta. This is the point of steepest slope, also known as the “median effective level”.

You are right that using an index variable may facilitate the computation, although I am not sure I understand how that would differ from retrieving the predictions using conditional_effects(). I have already retrieved the predictions (i.e., p(resp)) in each condition of interest. What I don’t know is how to compute the value of stim for which p(resp)=0.5 (in each condition).

Ladislas

matti · January 31, 2025, 6:37pm

What I meant is that if, for each condition z (the index), you have \alpha_z and \beta_z, then computing the point of subjective equality (PSE) for each z is trivial: it is -\alpha_z / \beta_z.
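A minimal sketch of that per-condition computation, assuming one intercept and one slope per condition. The draws below are simulated stand-ins with invented means and standard deviations, not output from any fitted model.

```r
set.seed(1)
n_draws <- 1000

# Simulated stand-ins for posterior draws of alpha_z and beta_z (hypothetical values),
# one column per condition z
alpha <- cbind(individual  = rnorm(n_draws, -0.5, 0.1),
               interactive = rnorm(n_draws,  0.2, 0.1))
beta  <- cbind(individual  = rnorm(n_draws,  1.0, 0.05),
               interactive = rnorm(n_draws,  1.5, 0.05))

# PSE computed draw by draw, so that posterior uncertainty carries through
pse_draws <- -alpha / beta

# Posterior median and 95% quantile interval for each condition
pse_summary <- t(apply(pse_draws, 2, function(x) {
  c(median = median(x), quantile(x, c(0.025, 0.975)))
}))
```

Doing the division per draw, rather than on posterior means, gives a full posterior distribution of the PSE for each condition.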

Ladislas · February 2, 2025, 9:55am

Oh, OK I got it now, sorry for the misunderstanding. In this situation, this would entail estimating the intercept and slope for each trial (and test and participant), thus going from estimating only one slope for trial (currently treated as a numeric variable) to as many slopes as there are trials, if I understood correctly. I am not sure this sort of model would converge, but I can try!

matti · February 10, 2025, 4:32pm

I see. This would be nice to do with hypothesis(), because it automatically calculates this for different grouping factors using the class and group arguments. Something like this:

library(tidyverse) # crossing(), mutate(), ggplot()
library(brms)

dat <- crossing(
  test = factor(c("a", "b")),
  stim = -5:5,
  trial = -10:10
)
dat <- dat |> 
  mutate(
    resp = rbinom(n(), 1, pnorm(stim * .2 + trial * -.02))
  )
dat |> 
  ggplot(aes(stim, resp, col = trial, group = trial)) +
  geom_smooth(
    method = "glm", se = F,
    method.args = list(family = binomial)
  )
fit <- brm(
  resp ~ 1 + test * stim * trial, 
  data = dat, 
  family = bernoulli(),
  cores = 4,
  backend = "cmdstanr",
  file = "tmp-brm-something"
)

summary(fit)

And then this (warning: I forget how to add the interaction terms, lol, so check your math):

ce <- conditional_effects(
  fit,
  conditions = tibble(trial = c(0, 5)),
  effects = "stim:test",
  robust = FALSE,
  plot = FALSE
)[[1]] |> 
  tibble()

h <- hypothesis(
  fit,
  c(
    "pse_testa_trial0" = "-(Intercept / stim) = 0",
    "pse_testb_trial0" = "-((Intercept + testb) / (stim + testb:stim)) = 0",
    "pse_testa_trial5" = "-((Intercept + trial*5) / 
      (stim + stim:trial*5)) = 0",
    "pse_testb_trial5" = "-((Intercept + testb + trial*5) / 
      (stim + testb:stim + testb:trial*5 + stim:trial*5 + testb:stim:trial*5)) = 0"
  )
)

h <- tibble(h$hypothesis[,1:5]) |> 
  mutate(
    stim = 0,
    test = c("a", "b", "a", "b"),
    trial = c(0, 0, 5, 5)
  )

ce |> 
  ggplot(aes(stim, estimate__)) +
  geom_line() +
  geom_vline(
    data = h, lty = "dashed",
    aes(xintercept = Estimate)
  ) +
  facet_grid(test~trial, labeller = label_both)

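As you can see, you can create the PSE for each combination of X like this, but please do better with adding the interactions :) (trial 5, test b is wrong, as you can see: the testb:trial*5 term does not multiply stim, so it belongs in the numerator rather than the denominator.)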
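One way to reduce the bookkeeping might be to build the hypothesis strings with a small helper. This is only a sketch: the function name pse_hypothesis() is hypothetical, the coefficient names are those of the example fit above, and I have not run the resulting strings through hypothesis().

```r
# Build a PSE hypothesis string for a given level of test and value of trial.
# Numerator: every term that does NOT involve stim.
# Denominator: every term that multiplies stim (with stim factored out).
pse_hypothesis <- function(testb, trial) {
  num <- c(
    "Intercept",
    if (testb) "testb",
    if (trial != 0) sprintf("trial*(%s)", trial),
    if (testb && trial != 0) sprintf("testb:trial*(%s)", trial)
  )
  den <- c(
    "stim",
    if (testb) "testb:stim",
    if (trial != 0) sprintf("stim:trial*(%s)", trial),
    if (testb && trial != 0) sprintf("testb:stim:trial*(%s)", trial)
  )
  sprintf("-((%s) / (%s)) = 0",
          paste(num, collapse = " + "),
          paste(den, collapse = " + "))
}

# One string per combination of test and trial
hyps <- c(
  pse_testa_trial0 = pse_hypothesis(FALSE, 0),
  pse_testb_trial0 = pse_hypothesis(TRUE, 0),
  pse_testa_trial5 = pse_hypothesis(FALSE, 5),
  pse_testb_trial5 = pse_hypothesis(TRUE, 5)
)
```

The named vector hyps has the same shape as the hand-written vector passed to hypothesis() above.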

Coding-wise it would be cleaner to use something like tidybayes::spread_draws()…

Let me know how it goes

zacho · February 21, 2025, 4:38pm

I’ve been following this thread with interest because it’s pretty similar to a problem I’ve been working on with a binomial model. I don’t know the clever / programmatic way to do this (perhaps something with the model matrix?) but it’s pretty much just algebra + patience to do the dumb method below. It uses spread_draws as @matti suggests.

Using @matti’s example fit:

library(tidybayes)

# note order of b_ coef's is same as in model matrix
coef_draws <- spread_draws(fit, `b_.*`, 
             # `sd_.*`, `cor_.*`, # for varying effects etc.
             regex = T, ndraws = 4000)

# combine with a grid of interest:
expand_grid(
  test = factor(c("a", "b")),
  trial = -5:5,
  coef_draws
) %>% 
  mutate(
    # dummy for test b
    x_b = if_else(test == "b", 1, 0),
    # the value of stim to yield logit(0.5)
    x_stim_for_0.5 = (qlogis(0.5) - b_Intercept - b_testb * x_b - b_trial * trial - 
      `b_testb:trial` * trial * x_b) / (
        b_stim + (`b_testb:stim` * x_b) + (`b_stim:trial` * trial) + 
          (`b_testb:stim:trial` * x_b * trial)
      )
  ) %>% 
  # plot
  ggplot(aes(trial, x_stim_for_0.5, fill = test, color = test)) + 
  ggdist::stat_halfeye(alpha = 0.3)

Essentially, rearrange the linear model to solve for the stim predictor, i.e.:
stim = {logit(0.5) - everything not associated with stim} / {everything multiplying stim, with stim factored out}

Note that this works just as well if you have varying intercepts/slopes: that spread_draws call can also grab the relevant sd_ and cor_ parameters. Though if you do have cor_ terms, that means you have to simulate the varying effects from the multivariate normal distribution… which I also did in a dumb manner :^)
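A minimal base-R sketch of that simulation step for a single posterior draw, assuming just a varying intercept and a varying stim slope. All numeric values are hypothetical; in practice sd_u and cor_u would be filled in from the sd_ and cor_ draws, and the loop over draws is omitted here.

```r
set.seed(42)

# Hypothetical values for ONE posterior draw
sd_u  <- c(Intercept = 0.8, stim = 0.3)          # group-level standard deviations
cor_u <- matrix(c(1, -0.4, -0.4, 1), 2, 2)       # group-level correlation matrix
b_Intercept <- -0.4                              # population-level intercept
b_stim      <- 1.2                               # population-level stim slope

# Covariance matrix from the standard deviations and correlations
Sigma <- diag(sd_u) %*% cor_u %*% diag(sd_u)

# Simulate varying effects for new participants: z %*% chol(Sigma) has covariance Sigma
n_new <- 5000
u <- matrix(rnorm(n_new * 2), ncol = 2) %*% chol(Sigma)
colnames(u) <- names(sd_u)

# Threshold for each simulated participant
threshold_new <- -(b_Intercept + u[, "Intercept"]) / (b_stim + u[, "stim"])
```

This gives thresholds for hypothetical new participants. For the participants actually in the data, the estimated r_ deviations could be used directly instead of simulating.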

I’d be keen to know if there’s a more programmatic method for this (e.g. if one wanted to know the same thing but for trial without the manual re-arrangement of model terms).
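One possibly more programmatic route relies only on the linear predictor being linear in the predictor of interest once the other predictors are fixed. Evaluating it at stim = 0 and at stim = 1 yields the "intercept" and the "slope" without writing out any model terms. The sketch below shows the idea with a plain glm() and predict(type = "link") so that it runs without Stan. With a brms fit, I would expect brms::posterior_linpred(fit, newdata = ..., re_formula = NULL) to play the same role and return one row per draw, but I have not tested that against the models in this thread.

```r
set.seed(123)

# Simulated data in the spirit of the example above (values are illustrative)
dat <- expand.grid(test = factor(c("a", "b")), stim = -5:5, trial = -10:10)
dat$resp <- rbinom(nrow(dat), 1, plogis(0.2 * dat$stim - 0.02 * dat$trial))

fit_glm <- glm(resp ~ 1 + test * stim * trial, family = binomial, data = dat)

# Conditions at which the threshold is wanted
grid <- expand.grid(test = factor(c("a", "b")), trial = c(0, 5))

# Linear predictor at stim = 0 (the "intercept") and at stim = 1
eta0 <- predict(fit_glm, newdata = transform(grid, stim = 0), type = "link")
eta1 <- predict(fit_glm, newdata = transform(grid, stim = 1), type = "link")

# eta1 - eta0 is the slope of stim in each condition
grid$threshold <- (qlogis(0.5) - eta0) / (eta1 - eta0)

# Sanity check: the linear predictor at the threshold should be logit(0.5) = 0
eta_check <- predict(fit_glm, newdata = transform(grid, stim = threshold),
                     type = "link")
```

The same trick should apply to trial: hold stim fixed and evaluate the linear predictor at trial = 0 and trial = 1 instead.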

matti · February 25, 2025, 3:16pm

Beautiful, that’s kinda what I had in mind but didn’t have the tidybayes chops / patience to figure out.

Ladislas · February 25, 2025, 3:30pm

Thank you both for your help!

I have also tried computing the PSE “manually” from the model’s estimates (see some messy code below), but problems come when considering the random/varying effects… here I am a bit uncertain about my implementation, and as @zacho said, it would be good to know whether there is a more programmatic way of achieving this.

Best wishes,

Ladislas

# retrieving the fixed/constant effects
fixed_effects <- fixef(trial_model)

# retrieving the random/varying effects for dyad and ppt
# remember that (1 | dyad / ppt) is equivalent to (1 | dyad) + (1 | dyad:ppt)
random_effects_dyad <- ranef(trial_model)$dyad
random_effects_ppt <- ranef(trial_model)$`dyad:ppt`

# creating a data frame with test, dyad, participant (ppt), trial
results <- df_trial %>%
    select(-ppt) %>%
    rename(ppt = `dyad:ppt`) %>%
    distinct(test, dyad, ppt, trial) %>%
    mutate(dyad = as.character(dyad) ) %>%
    mutate(trial = as.numeric(trial), test = ifelse(test == "individual", -0.5, 0.5) ) %>%
    rowwise() %>%
    mutate(
        # fixed/constant effects
        intercept = fixed_effects["Intercept", "Estimate"],
        test_effect = fixed_effects["test1", "Estimate"] * test,
        trial_effect = fixed_effects["trial", "Estimate"] * trial,
        test_trial_effect = fixed_effects["test1:trial", "Estimate"] * test * trial,
        stim_effect = fixed_effects["stim", "Estimate"],
        test_stim_effect = fixed_effects["test1:stim", "Estimate"] * test,
        stim_trial_effect = fixed_effects["stim:trial", "Estimate"] * trial,
        test_stim_trial_effect = fixed_effects["test1:stim:trial", "Estimate"] * test * trial,
        # random/varying effects (dyad-level)
        random_intercept_dyad = random_effects_dyad[, , "Intercept"][dyad, "Estimate"],
        random_test_dyad = random_effects_dyad[, , "test1"][dyad, "Estimate"] * test,
        random_trial_dyad = random_effects_dyad[, , "trial"][dyad, "Estimate"] * trial,
        random_stim_dyad = random_effects_dyad[, , "stim"][dyad, "Estimate"],
        random_test_stim_dyad = random_effects_dyad[, , "test1:stim"][dyad, "Estimate"] * test,
        random_stim_trial_dyad = random_effects_dyad[, , "stim:trial"][dyad, "Estimate"]  * trial,
        random_test_stim_trial_dyad = random_effects_dyad[, , "test1:stim:trial"][dyad, "Estimate"]  * test * trial,
        # random-varying effects (ppt-level, nested in dyad)
        random_intercept_ppt = random_effects_ppt[, , "Intercept"][ppt, "Estimate"],
        random_test_ppt = random_effects_ppt[, , "test1"][ppt, "Estimate"] * test,
        random_trial_ppt = random_effects_ppt[, , "trial"][ppt, "Estimate"] * trial,
        random_stim_ppt = random_effects_ppt[, , "stim"][ppt, "Estimate"],
        random_test_stim_ppt = random_effects_ppt[, , "test1:stim"][ppt, "Estimate"] * test,
        random_stim_trial_ppt = random_effects_ppt[, , "stim:trial"][ppt, "Estimate"] * trial,
        random_test_stim_trial_ppt = random_effects_ppt[, , "test1:stim:trial"][ppt, "Estimate"]* test * trial,
        # combining fixed and random effects for the numerator
        # (only terms that do not involve stim; stim-related terms belong in the denominator)
        numerator = intercept +
            test_effect + trial_effect + test_trial_effect +
            random_intercept_dyad + random_test_dyad + random_trial_dyad +
            random_effects_dyad[, , "test1:trial"][dyad, "Estimate"] * test * trial +
            random_intercept_ppt + random_test_ppt + random_trial_ppt +
            random_effects_ppt[, , "test1:trial"][ppt, "Estimate"] * test * trial,
        # combining fixed and random effects for the denominator
        denominator = stim_effect +
            test_stim_effect + stim_trial_effect + test_stim_trial_effect +
            random_stim_dyad + random_test_stim_dyad + random_stim_trial_dyad + random_test_stim_trial_dyad +
            random_stim_ppt + random_test_stim_ppt + random_stim_trial_ppt + random_test_stim_trial_ppt,
        # computing stim at which p(resp) = 0.5
        stim_at_p_0.5 = -numerator / denominator
        ) %>%
    ungroup()
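One caveat about working from the "Estimate" columns: a ratio of posterior means is generally not the posterior mean (or median) of the ratio, and it carries no uncertainty. A sketch of the alternative is to compute the ratio draw by draw. The draws below are simulated stand-ins with invented values; with brms, I believe fixef() and ranef() accept summary = FALSE to return the draws instead of summaries.

```r
set.seed(7)
n_draws <- 4000

# Simulated stand-ins for posterior draws of the numerator and denominator
# for one participant/test/trial combination (hypothetical values)
numerator_draws   <- rnorm(n_draws, mean = -0.4, sd = 0.3)
denominator_draws <- rnorm(n_draws, mean = 1.0, sd = 0.25)

# Plug-in estimate from posterior means (what the "Estimate" columns give)
plug_in <- -mean(numerator_draws) / mean(denominator_draws)

# Per-draw thresholds, then summarise
per_draw <- -numerator_draws / denominator_draws
per_draw_summary <- c(median = median(per_draw),
                      quantile(per_draw, c(0.025, 0.975)))
```

When the denominator has posterior mass near zero, the per-draw ratio becomes heavy-tailed, so the median and quantiles are likely to be more stable summaries than the mean.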
