Calculate indirect effects for a longitudinal Gaussian Process mediation model

Hello,

I am running a longitudinal mediation model with both the independent and the mediatior variables being time-varying (for 6 time points), using Gaussian Process in brms.

Would be grateful if anyone could help me with the correct way to calculate indirect effects (knowing the procedure is not as straightforward as it would be for a simpler multilevel mediation model without gp). The model is described below.

Many thanks!

a<-bf(M ~ gp(time) + gp(time, by=X) + (1|id))
b<-bf(Y ~ gp(time) + gp(time, by=M) + gp(time, by=X) + (1|id))
mediation_model<-brm(a + b + set_rescor(FALSE), control = list(adapt_delta = 0.95), init = 0, iter = 5000,warmup = 2500, core=4, seed = 007, data = data)

In the meantime I figured out a way to run a longitudinal mediation model with Gaussian Process in brms. The process should be different from classic mediation models such as the ones used in clinical research where X (the predictor) is usually the Treatment variable (coded as 0 or 1 for instance). In my model X (and also M and Y) are continous, so I wouldn’t go for adopting a counterfactual mediation framework (e.g. the one seen in mediation package).

My data is in long format with 6 assessment points over time for all variables of interest (over a period of 12 months). The hypothesised mediation effect here is that X is indirectly associated with Y via M, with M having a longitudinal mediation effect. Here it is my code for it:

a<-bf(M ~ gp(time) + gp(time, by=X, k = 5, c = 5/4) + (1|id))
b<-bf(Y ~ gp(time) + gp(time, by=M, k = 5, c = 5/4 ) + gp(time, by=X, k = 5, c = 5/4) + (1|id))

## I used k = 5 because there are only 6 time points and I am not particularly interested in modelling multiple hypothetical time points betwen them.

mediation_model<-brm(a + b + set_rescor(FALSE), control = list(adapt_delta = 0.95), init = 0, iter = 5000,warmup = 2500, core=4, seed = 123, data = data)

## Initially I used default priors and then weakly-informative priors (strongly recommended). Also because in the first attempt I got >300 divergent transitions after warmup I really had to improve my model. In my case it worked really well to increase adapt_delta to 0.99 with max_treedepth=15 and to adopt wekly-informative priors instead (no divergent transistions in the end).

In order to account for continuous X and M the idea was to to create two prediction datasets from my data: one as-it-is, and another one bumped up by a tiny amount (delta=0.001). Delta is used to compute numerical derivatives using a tiny increment. Differences in predictions give us the slopes across observed data points.

# Define a tiny change for the numerical derivative
delta <- 0.001


# Re-create the grids ensuring all relevant columns exist and match types perfectly

grid_base <- data %>% 
  select(id, time, X, M, Y) %>%
  mutate(M = as.numeric(M)) ## had to do that to make sure the variable is treated as continuous

grid_X_bump <- grid_base %>% 
  mutate(X = as.numeric(X) + delta)

grid_M_bump <- grid_base %>% 
  mutate(M = M + delta)

# PATH A: Derivative of M with respect to X (dM/dX) over time
# Predict M for base X and bumped X
# Force brms to isolate the specific multivariate model block for M
pred_M_base <- posterior_epred(mediation_model, resp = "M", newdata = grid_base)
pred_M_bump <- posterior_epred(mediation_model, resp = "M", newdata = grid_X_bump)

# Check the dimensions and see if they are genuinely identical
print(all(pred_M_base == pred_M_bump))

# Calculate the slope (dM/dX) for every posterior draw and observation
path_a_draws <- (pred_M_bump - pred_M_base) / delta

# PATH B: Derivative of Y with respect to M (dY/dM) over time
# Predict Y for base M and bumped M
pred_Y_base <- posterior_epred(mediation_model, resp = "Y", newdata = grid_base)
pred_Y_bump <- posterior_epred(mediation_model, resp = "Y", newdata = grid_M_bump)

print(all(pred_Y_base == pred_Y_bump))

# Calculate the slope (dY/dM) for every posterior draw and observation
path_b_draws <- (pred_Y_bump - pred_Y_base) / delta

# INDIRECT EFFECT: Multiply slopes and aggregate by Time Point
# Multiply the two paths point-by-point for every draw
indirect_draws <- path_a_draws * path_b_draws

# Map the columns of our matrix back to their respective time points
time_vectors <- grid_base$time

# Convert to long format, average over participants per draw, and get intervals
indirect_summary <- indirect_draws %>%
  as_tibble(.name_repair = "unique") %>%
  # Add posterior draw IDs so we can group by them
  mutate(draw = row_number()) %>%
  pivot_longer(cols = -draw, names_to = "obs_index", values_to = "value") %>%
  # Link each observation index back to its corresponding time point
  mutate(time = time_vectors[as.numeric(str_remove(obs_index, "..."))]) %>%
  # Step 1: Average across all participants within each draw + time point
  group_by(draw, time) %>%
  summarise(mean_indirect = mean(value), .groups = "drop") %>%
  # Step 2: Compute Bayesian credible intervals across the posterior draws
  group_by(time) %>%
  median_qi(mean_indirect, .width = 0.95)

print(indirect_summary)

I used the same procedure to calculate direct and total effects and the proportion mediated effect.

I acknowledged there are other ways to do this and it will always depend on your study hypothesis and your data. For instance if there are multiple time points it might be helpful to adjust some hyperparameters and one wants to model uncertainty between time points it might be considered creating a dense prediction grid instead… Hope this helps and some feedback and different perspectives would be really welcome!