I’m posting this in case anyone finds it useful…
I was getting some funky results with Stan until I eventually simplified my problem to the point that I realized my issue was with linear regression with highly correlated covariates.
In particular, I was trying to fit an error correction model (ECM), which is a time series model that extends a vector autoregression (VAR) in differences to account for long-term trends. My issue was in the long-term trend part of the ECM. Basically, I was regressing a variable against two variables that were highly correlated (over 99%). When I checked out the pairs plot, this became very obvious. Part of the reason they were so highly correlated was because an ECM model requires regressing time series variables in levels (both were I(1)) and the covariates shared a common trend.
rstan recommending increasing the tree depth and while that allowed the model to be fit without giving a warning, the effective sample size was still quite low.
What did I do instead? Once I realized the problem, all I needed to do was de-trend covariates. This sent the correlation between them to around 30% and Stan didn’t have any issue with it (much faster execution and higher effective sample size).
The Stan documentation mentions a few techniques that can improve Stan’s efficiency, like standardizing variables, but de-trending can also be an important tool in some cases.
See below for accompanying code.
library(rstan)
library(MASS)
rstan_options(auto_write = TRUE)
# Model
stan_model <- "
data {
int<lower=0> T;
int<lower=0> N_X;
vector[T] Y;
matrix[T, N_X] X;
}
parameters {
vector[N_X] B;
real<lower=0> sigma;
}
model {
B ~ normal(0, 100);
sigma ~ cauchy(0, 1);
Y ~ normal(X * B, sigma);
}
"
# Stan parameters
N_iter <- 500
N_chain <- 4
# Simulation Parameters
T <- 285
N_X <- 3
Y_coefficients <- c(-39.512948, 1.833247, 2.219502)
sd_Y_residual <- 0.05590053
X_initial <- c(10.9678, 11.00594)
mu_diff_X <- c(0.003503189, 0.003508181)
sigma_diff_X <- matrix(c(2.801688E-5, 1.243823E-5, 1.243823E-5, 1.340090E-5), 2, 2)
# Simulate
X_sim <- mvrnorm(T - 1, mu_diff_X, sigma_diff_X)
X_sim <- rbind(X_initial, X_sim)
X_sim <- apply(X_sim, 2, cumsum)
X_sim <- cbind(rep(1, T), X_sim)
Y_sim <- X_sim %*% Y_coefficients + rnorm(T, 0, sd_Y_residual)
reg1 <- lm(Y_sim ~ X_sim[, 2] + X_sim[, 3])
stan_data1 <- list(T=T, N_X=N_X, Y=as.vector(Y_sim), X=X_sim)
fit_1 <- stan(model_code = stan_model,
data = stan_data1, verbose = FALSE,
iter=N_iter, chain=N_chain)
#gets warning about number of transitions
print(fit_1)
# De-trend Simulation
X_sim_alt <- X_sim
reg_detrend1 <- lm(X_sim_alt[, 2] ~ seq(1, T))
reg_detrend2 <- lm(X_sim_alt[, 3] ~ seq(1, T))
X_sim_alt[, 2] <- reg_detrend1$residuals
X_sim_alt[, 3] <- reg_detrend2$residuals
Y_sim_alt <- X_sim_alt %*% Y_coefficients + rnorm(T, 0, sd_Y_residual)
stan_data2 <- list(T=T, N_X=N_X, Y=as.vector(Y_sim_alt), X=X_sim_alt)
reg2 <- lm(Y_sim_alt ~ X_sim_alt[, 2] + X_sim_alt[, 3])
fit_2 <- stan(model_code = stan_model,
data = stan_data2, verbose = FALSE,
iter=N_iter, chain=N_chain)
print(fit_2)