Operating System: macOS 10.14.6
brms version: 2.10.0
Hi all,
This popular post on piecewise linear models with random change points inspired me to try it out, but I’m having a little trouble fitting my model. In particular, I’d like to extend the model to include a discontinuity at the change point. Ultimately, I want to fit the following:
Y_{i} \sim \beta_{0} + \beta_{1}(x_{i} - \omega)I(x_{i} < \omega) + \beta_{2}(x_{i} - \omega)I(x_{i} \geq \omega) + \beta_{3}I(x_{i} \geq \omega)
where \beta_{0} is the level of Y at the change point \omega, \beta_{1} is the slope before the change point, \beta_{2} is the slope after the change point, and \beta_{3} is the size of the discontinuity at the change point. I(\cdot) is the indicator function, equal to 1 when its argument is true and 0 otherwise.
As you’ll see below, it doesn’t quite work yet. My suspicion is that the model isn’t identified. I can get the model to fit without the discontinuity, so the discontinuity term is probably where the trouble lies, but I don’t know why.
If anyone has any advice, I’d love to hear it.
Example: Below is code to reproduce the problem with a Gaussian outcome.
# Load packages
library(tidyverse)
library(brms)
# Create function to simulate data
sim_chgpt <- function(n = 100, b0 = 10, b1 = 0.5, b2 = 0, b3 = 30, om = 100){
  # Create running variable for each observation
  x <- runif(n, 0, 200)
  # Add observation-level noise to the omega parameter
  om <- om + rnorm(n)
  # Create outcome
  y <-
    b0 +
    ( b1 * ( x - om ) * ( x < om ) ) +
    ( b2 * ( x - om ) * ( x >= om ) ) +
    ( b3 * ( x >= om ) ) +
    rnorm(n, 0, 5)
  # Return tibble
  tibble(x, y)
}
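As a quick sanity check on the simulator, a noise-free version of the outcome function should jump by exactly b3 = 30 at the change point. (The f() below is just a helper I wrote for this check; it isn’t part of the model code.)

```r
# Noise-free version of the simulated outcome, to verify the jump at omega
f <- function(x, b0 = 10, b1 = 0.5, b2 = 0, b3 = 30, om = 100) {
  b0 +
    ( b1 * ( x - om ) * ( x < om ) ) +
    ( b2 * ( x - om ) * ( x >= om ) ) +
    ( b3 * ( x >= om ) )
}
# Jump just across the change point, approximately b3 = 30
f(100) - f(100 - 1e-6)
```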
# Check graphically
sim_chgpt() %>%
  ggplot(aes(x = x, y = y)) +
  geom_point()
# Fit model
m1 <-
  brm(
    formula =
      bf(y ~
           b0 +
           b1 * ( x - omega ) * ( x < omega ) +
           b2 * ( x - omega ) * ( x >= omega ) +
           b3 * ( x >= omega ),
         b0 + b1 + b2 + b3 + omega ~ 1,
         nl = TRUE),
    family = gaussian(),
    prior =
      prior(normal(0, 5), nlpar = "b0") +
      prior(normal(0, 5), nlpar = "b1") +
      prior(normal(0, 5), nlpar = "b2") +
      prior(normal(0, 20), nlpar = "b3") +
      prior(normal(100, 5), nlpar = "omega", class = "b"),
    data = sim_chgpt(n = 1000),
    iter = 2e3,
    refresh = 5,
    chains = 2,
    cores = 2
  )
This gives the following warnings:
Warning messages:
1: There were 1931 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
2: Examine the pairs() plot to diagnose sampling problems
3: The largest R-hat is 1.91, indicating chains have not mixed.
Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#r-hat
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#bulk-ess
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See http://mc-stan.org/misc/warnings.html#tail-ess
And the following output:
Population-Level Effects:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
b0_Intercept        2.00      8.20   -14.44    10.22 1.91        3       52
b1_Intercept       -0.14      3.86   -10.01     7.88 1.85       95       70
b2_Intercept        0.24      0.23    -0.00     0.49 1.83        3       61
b3_Intercept        0.23     30.04   -38.11    30.87 1.84        3       94
omega_Intercept    49.99     49.88    -0.53   100.01 1.83        3       54
Clearly something is wrong: the estimates are far from the simulated values, the Rhat values are high, and the bulk and tail ESS values are very low.
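For what it’s worth, one variant I’m planning to try next is putting explicit bounds on omega, so the sampler can’t wander outside the observed range of x (note omega_Intercept above is sitting around 50 with a huge posterior SD). This is just a sketch of the idea using the lb/ub arguments of prior(); I haven’t verified that it fixes the problem:

```r
# Same model as m1, but with omega constrained to [0, 200],
# the range of x in the simulation (untested sketch)
m2 <-
  brm(
    formula =
      bf(y ~
           b0 +
           b1 * ( x - omega ) * ( x < omega ) +
           b2 * ( x - omega ) * ( x >= omega ) +
           b3 * ( x >= omega ),
         b0 + b1 + b2 + b3 + omega ~ 1,
         nl = TRUE),
    family = gaussian(),
    prior =
      prior(normal(0, 5), nlpar = "b0") +
      prior(normal(0, 5), nlpar = "b1") +
      prior(normal(0, 5), nlpar = "b2") +
      prior(normal(0, 20), nlpar = "b3") +
      prior(normal(100, 5), nlpar = "omega", lb = 0, ub = 200),
    data = sim_chgpt(n = 1000),
    chains = 2,
    cores = 2
  )
```

If bounding omega is the wrong way to approach this, I’d be glad to hear alternatives.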