Calculate the first derivative and it's posterior distribution of an estimated spline trend

dilsherdhillon · August 25, 2019, 4:45pm

Inspired by this post https://www.fromthebottomoftheheap.net/2014/05/15/identifying-periods-of-change-with-gams/, I’m trying to identify periods of change in a GAM model, using bayesian inference.

However, I find that my choice of epsilon (the delta difference between two points), is very sensitive to the resulting confidence intervals. If I choose a really small epsilon, which is what we should want and do, my intervals of the associated first derivative blow up. If this epsilon is bigger, the intervals are narrower.

I’m using the method of finite differences , where we estimate the trend at two points very close to each other, and calculate the difference between those two points to estimate the slope.

I’m clearly missing out on something while calculating the derivatives.

library(dplyr)
library(magrittr)
library(ggplot2)
library(brms)
library(mgcv)

set.seed(08242019)
dta <- mgcv::gamSim(eg = 4, n = 200, scale = 2)


mod <- brms::brm(
  y ~ s(x2, by = fac), data = dta, 
  chains = 4,
    cores = 4,
    prior = c(
      prior(student_t(3, 0, 1), class = sigma),
      prior(student_t(3, 0, 1), class = sds),
      prior(student_t(3, 0, 1), class = b),
      prior(student_t(3, 0, 5), class = Intercept)
))


epsilon <- 0.1

first <- expand.grid(x2 = seq(0, 0.99, length = 4000),
                    fac = c('1', '2','3'))

second <- expand.grid(x2 = seq(0 + epsilon, 0.99 + epsilon, length = 4000),
                    fac = c('1', '2','3'))

## get predictions
first_preds <- posterior_predict(mod, newdata = first, summary = FALSE)
second_preds <- posterior_predict(mod, newdata = second, summary = FALSE)

## calcualte the difference of the predictions  
diff <- (second_preds - first_preds)/epsilon

## using the posterior samples, we calculate the mean and lower and upper quantiles  
mean_finite_diff <- apply(diff, MARGIN = 2,FUN = mean)
lower_finite_diff <- apply(diff, MARGIN = 2,FUN = quantile, prob = 0.025)
upper_finite_diff <- apply(diff, MARGIN = 2,FUN = quantile, prob = 0.975)

data.frame(mean_finite_diff) %>%
  cbind(lower_finite_diff) %>%
  cbind(upper_finite_diff) %>%
  cbind(first) %>%
  ggplot(.,aes(x2, mean_finite_diff, group  = fac)) + geom_smooth(aes(color = fac), se = FALSE) + geom_ribbon(aes(ymin = lower_finite_diff, 
                                                                                                                  ymax = upper_finite_diff), alpha = 0.05)

However, when you decrease the epsilon,

epsilon <- 0.0001

first <- expand.grid(x2 = seq(0, 0.99, length = 4000),
                    fac = c('1', '2','3'))

#second <- expand.grid(x2 = seq(0 + epsilon, 0.99 + epsilon, length = 4000),
 #                   fac = c('1', '2','3'))


second <- first %>%
          mutate(x2 = x2 + epsilon)

## get predictions
first_preds <- posterior_predict(mod, newdata = first, summary = FALSE)
second_preds <- posterior_predict(mod, newdata = second, summary = FALSE)

## calcualte the difference of the predictions  
diff <- (second_preds - first_preds)/epsilon

## using the posterior samples, we calculate the mean and lower and upper quantiles  
mean_finite_diff <- apply(diff, MARGIN = 2,FUN = mean)
lower_finite_diff <- apply(diff, MARGIN = 2,FUN = quantile, prob = 0.025)
upper_finite_diff <- apply(diff, MARGIN = 2,FUN = quantile, prob = 0.975)


## plot

data.frame(mean_finite_diff) %>%
  cbind(lower_finite_diff) %>%
  cbind(upper_finite_diff) %>%
  cbind(first) %>%
  ggplot(.,aes(x2, mean_finite_diff, group  = fac)) + geom_smooth(aes(color = fac), se = FALSE) + geom_ribbon(aes(ymin = lower_finite_diff, 
                                                                                                                  ymax = upper_finite_diff), alpha = 0.05)

Ideally we want a lower epsilon to get a better approximation of the derivative. But why do the uncertainty intervals blow up?

dilsherdhillon · August 25, 2019, 11:09pm

I’m not a 100% sure why, but the solution is just to use posterior_linpred instead. I still don’t understand though, why the choice of epsilon would be sensitive to the sampling variance in posterior_predict.

Topic		Replies	Views
Fitted values transformation General rstan , posterior-predictive , brms	1	523	February 3, 2021
Derivative of Gaussian Process Modeling gaussian-process	17	3047	August 3, 2022
Differences between posterior_smooths vs posterior_predict vs posterior_epred Modeling brms	6	795	November 2, 2023
Difference for splines between brms and gamm4 brms	21	4217	April 23, 2019
Honey, I shrunk the random slopes (too much?) Modeling specification , hierarchical-model , brms	13	1089	September 8, 2023

Calculate the first derivative and it's posterior distribution of an estimated spline trend

Related topics