Advice/Peer Review: Piecewise Linear Model w/ Random Change Point and Discontinuity

Operating System: macOS 10.14.6
brms version: 2.10.0

Hi all,

This popular post on piecewise linear models with random change points inspired me to try it out, but I’m having a little trouble fitting my model. In particular, I’d like to extend the model to include a discontinuity at the change point. Ultimately, I want to fit the following:

Y_{i} \sim \beta_{0} + \beta_{1}(x_{i} - \omega)I(x_{i} < \omega) + \beta_{2}(x_{i} - \omega)I(x_{i} \geq \omega) + \beta_{3}I(x_{i} \geq \omega)

Where \beta_{0} is the level of Y at the change point \omega, \beta_{1} is the slope before the change point, \beta_{2} is the slope after the change point, and \beta_{3} is the size of the discontinuity at the change point. I(\cdot) is an indicator function that equals 1 when its argument is true and 0 otherwise.

As you’ll see below, it doesn’t quite work yet. My suspicion is that the model isn’t identified. I can get the model to fit without the discontinuity, so that’s probably where the trouble is. But I don’t know why.
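For reference, the no-discontinuity version that does fit is essentially the same non-linear formula with the b3 term dropped, i.e. roughly:

# Continuous piecewise-linear formula (no discontinuity) -- this version fits fine for me
bf(y ~
     b0 +
     b1 * ( x - omega ) * ( x < omega ) +
     b2 * ( x - omega ) * ( x >= omega ),
   b0 + b1 + b2 + omega ~ 1,
   nl = TRUE)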

If anyone has any advice, I’d love to hear it.

Example: Below is code to reproduce this assuming a Gaussian outcome.

# Load packages
library(tidyverse)
library(brms)

# Create function to simulate data
sim_chgpt <- function(n = 100, b0 = 10, b1 = 0.5, b2 = 0, b3 = 30, om = 100){
  
  # Create running variable for each observation
  x <- runif(n, 0, 200)
  
  # Add noise to omega parameter
  om <- om + rnorm(n)
  
  # Create outcome
  y <-
    b0 +
    ( b1 * ( x - om ) * ( x < om ) ) +
    ( b2 * ( x - om ) * ( x >= om ) ) +
    ( b3 * ( x >= om ) ) +
    rnorm(n, 0, 5)
  
  # Return tibble
  tibble(x, y)
}

# Check graphically
sim_chgpt() %>%
  ggplot(aes(x = x, y = y)) +
  geom_point()

# Fit model
m1 <-
  brm(
    formula = 
      bf(y ~
           b0 +
           b1 * ( x - omega ) * ( x < omega ) +
           b2 * ( x - omega ) * ( x >= omega ) +
           b3 * ( x >= omega ),
         b0 + b1 + b2 + b3 + omega ~ 1,
         nl = TRUE),
    family = gaussian(),
    prior = 
      prior(normal(0, 5), nlpar = "b0") +
      prior(normal(0, 5), nlpar = "b1") +
      prior(normal(0, 5), nlpar = "b2") +
      prior(normal(0, 20), nlpar = "b3") +
      prior(normal(100, 5), nlpar = "omega", class = "b"),
    data = sim_chgpt(n = 1000),
    iter = 2e3,
    refresh = 5,
    chains = 2,
    cores = 2
  )

This gives the following warnings:

Warning messages:
1: There were 1931 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
2: Examine the pairs() plot to diagnose sampling problems

3: The largest R-hat is 1.91, indicating chains have not mixed.
Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#r-hat 
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#bulk-ess 
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#tail-ess 

And the following output:

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
b0_Intercept        2.00      8.20   -14.44    10.22 1.91        3       52
b1_Intercept       -0.14      3.86   -10.01     7.88 1.85       95       70
b2_Intercept        0.24      0.23    -0.00     0.49 1.83        3       61
b3_Intercept        0.23     30.04   -38.11    30.87 1.84        3       94
omega_Intercept    49.99     49.88    -0.53   100.01 1.83        3       54

Clearly something is wrong. The estimates are off, the Rhats are high, and the bulk and tail ESSs are low.


The discontinuity is IMHO indeed the problem: note that \beta_1(x - \omega)I(x < \omega) produces a likelihood that is continuous w.r.t. \omega, but \beta_3 I(x \geq \omega) does not. And Stan does not like non-continuous likelihoods. To make this work, I would suggest using inv_logit or something similar to get a continuous analog of the step function. This IMHO usually also makes more sense theoretically - I don’t think there are many processes where there is actual instantaneous change.
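To illustrate what I mean (the scale s below is just an arbitrary choice for the sketch; in R, plogis is the equivalent of Stan’s inv_logit):

# Hard step vs. smooth inv_logit approximation around the change point
omega <- 100
s     <- 1                                # transition scale; smaller = sharper step
x     <- seq(90, 110, by = 0.1)

step   <- as.numeric(x >= omega)          # discontinuous in omega
smooth <- plogis((x - omega) / s)         # continuous in omega, approximates the step

plot(x, step, type = "l", lty = 2, ylab = "indicator")
lines(x, smooth)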

Hope that helps!


Thanks for helping! It’s definitely the discontinuity: I constrained b3 to be positive and the problem went away (though a lot of max_treedepth warnings remained).
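For anyone following along: in brms, one way to impose that kind of positivity constraint is a lower bound on the prior, roughly like this (a sketch, not necessarily exactly what I ran):

# One way to restrict b3 to positive values: lower bound on its prior (sketch)
prior(normal(0, 20), nlpar = "b3", lb = 0)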

Just so I’m clear, do you mean the following?:

formula = 
      bf(y ~
           b0 +
           b1 * ( x - omega ) * ( x < omega ) +
           b2 * ( x - omega ) * ( x >= omega ) +
           b3 * inv_logit( x >= omega ),
         b0 + b1 + b2 + b3 + omega ~ 1,
         nl = TRUE)

Or something else?

I think inv_logit( x >= omega ) won’t work, as it is also discontinuous; what I meant was inv_logit( x - omega ), which is continuous…
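I.e., the formula would look roughly like this (a sketch; you could also divide by a scale inside inv_logit to control how sharp the transition is):

# Sketch: continuous approximation of the discontinuity term
bf(y ~
     b0 +
     b1 * ( x - omega ) * ( x < omega ) +
     b2 * ( x - omega ) * ( x >= omega ) +
     b3 * inv_logit( x - omega ),
   b0 + b1 + b2 + b3 + omega ~ 1,
   nl = TRUE)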

Also, I’m not sure how constraining b3 to be positive would resolve your issues, so I consider it likely that the problem is still there (as witnessed by the max_treedepth warnings).

Best of luck with your model!

This seems to work, thanks!