Hello - I added time variable ‘year’ as a monotone predictor in my stan model. I am using brms package.
When doing posterior predictions, I use new data which has year as numeric variable. However, posterior predict does not accept new values of year as the model has year as monotone variable. Has anyone else faced this issue?
Thank you.
It is not possible to predict monotonic effects, sensu Bürkner & Charpentier 2020 to new ordinal levels outside the range of ordinal levels used to fit the model. Intuitively, you can see that this is the case because the model makes no assumption about the magnitude of the gaps between levels, only that the sign of the gaps must match the rest of the model to ensure monotonicity. So the likelihood contains no information abut the values of levels outside the range of ordinal levels used in monotonic fitting. This monotonic model does not permit meaningful extrapolation.
1 Like
Agree that the monotonic ordinal predictors in {brms}
cannot really extrapolate in the way that you’d like. However you may get the behaviour you are after by using a monotonic penalized smooth function of time
in the {mvgam}
package. This uses basis constructors from the {splines2}
package that allow for monotonically increasing or decreasing splines. They still might not extrapolate very well, but at least the extrapolations will respect the monotonicity constraints. Below is a simple reprex
library(mvgam)
library(mgcv)
library(marginaleffects)
library(ggplot2); theme_set(theme_classic())
# Simulate data from a monotonically increasing function
set.seed(99)
x <- runif(80) * 4 - 1
x <- sort(x)
f <- exp(4 * x) / (1 + exp(4 * x))
y <- f + rnorm(80) * 0.1
plot(x, y, pch = 16)

# Split into training and testing
mod_data <- data.frame(y = y, x = x) %>%
dplyr::arrange(x)
data_train <- mod_data[1:65, ]
data_test <- mod_data[66:80, ]
# A standard TRPS smooth from mgcv doesn't capture increasing monotonicity
mod <- gam(y ~ s(x, k = 24),
data = data_train,
family = gaussian())
gratia::draw(mod)

# And neither do the extrapolations
plot_predictions(mod,
by = 'x',
newdata = mod_data,
points = 0.5)

# Using the 'moi' (monotonically increasing) basis in mvgam rectifies this
# and will extrapolate appropriately
mod2 <- mvgam(y ~ s(x, bs = 'moi', k = 24),
data = data_train,
family = gaussian(),
chains = 2)
#> Compiling Stan program using cmdstanr
#>
#> Start sampling
#> Running MCMC with 2 parallel chains...
#>
#> Chain 1 Iteration: 1 / 1000 [ 0%] (Warmup)
#> Chain 1 Iteration: 100 / 1000 [ 10%] (Warmup)
#> Chain 2 Iteration: 1 / 1000 [ 0%] (Warmup)
#> Chain 1 Iteration: 200 / 1000 [ 20%] (Warmup)
#> Chain 1 Iteration: 300 / 1000 [ 30%] (Warmup)
#> Chain 2 Iteration: 100 / 1000 [ 10%] (Warmup)
#> Chain 2 Iteration: 200 / 1000 [ 20%] (Warmup)
#> Chain 1 Iteration: 400 / 1000 [ 40%] (Warmup)
#> Chain 2 Iteration: 300 / 1000 [ 30%] (Warmup)
#> Chain 1 Iteration: 500 / 1000 [ 50%] (Warmup)
#> Chain 1 Iteration: 501 / 1000 [ 50%] (Sampling)
#> Chain 2 Iteration: 400 / 1000 [ 40%] (Warmup)
#> Chain 1 Iteration: 600 / 1000 [ 60%] (Sampling)
#> Chain 2 Iteration: 500 / 1000 [ 50%] (Warmup)
#> Chain 2 Iteration: 501 / 1000 [ 50%] (Sampling)
#> Chain 2 Iteration: 600 / 1000 [ 60%] (Sampling)
#> Chain 1 Iteration: 700 / 1000 [ 70%] (Sampling)
#> Chain 2 Iteration: 700 / 1000 [ 70%] (Sampling)
#> Chain 2 Iteration: 800 / 1000 [ 80%] (Sampling)
#> Chain 1 Iteration: 800 / 1000 [ 80%] (Sampling)
#> Chain 2 Iteration: 900 / 1000 [ 90%] (Sampling)
#> Chain 1 Iteration: 900 / 1000 [ 90%] (Sampling)
#> Chain 2 Iteration: 1000 / 1000 [100%] (Sampling)
#> Chain 2 finished in 1.3 seconds.
#> Chain 1 Iteration: 1000 / 1000 [100%] (Sampling)
#> Chain 1 finished in 1.6 seconds.
#>
#> Both chains finished successfully.
#> Mean chain execution time: 1.4 seconds.
#> Total execution time: 1.9 seconds.
gratia::draw(mod2)

plot_predictions(mod2,
by = 'x',
newdata = mod_data,
points = 0.5)

Created on 2024-11-01 with reprex v2.0.2
6 Likes