Hi everyone,
I am fitting an error correction model (ECM) in Stan and trying to estimate the long-term effects of my regressors. To check whether my model is right, I use the ecm package in R and the example from the package’s reference manual.
library(ecm)
library(rstan)
library(tidyverse)
data(Wilshire)
trn <- Wilshire[Wilshire$date<='2015-12-01',]
#Assume all predictors are needed in the equilibrium and transient terms of ecm
xeq <- xtr <- trn[c('CorpProfits', 'FedFundsRate', 'UnempRate')]
model1 <- ecm(trn$Wilshire5000, xeq, xtr, includeIntercept=TRUE)
summary(model1)
Call:
lm(formula = dy ~ ., data = x, weights = weights)

Residuals:
    Min      1Q  Median      3Q     Max
-6.9223 -0.6088  0.0210  0.6822  3.9381

Coefficients:
                    Estimate Std. Error t value     Pr(>|t|)
(Intercept)       -0.4160213  0.8236705  -0.505     0.614155
deltaCorpProfits   0.0119853  0.0020332   5.895 0.0000000197 ***
deltaFedFundsRate -0.1231619  0.1547487  -0.796     0.427210
deltaUnempRate    -1.4841457  0.4389805  -3.381     0.000896 ***
CorpProfitsLag1    0.0027077  0.0008258   3.279     0.001265 **
FedFundsRateLag1   0.0655636  0.0494706   1.325     0.186849
UnempRateLag1     -0.0532751  0.1040916  -0.512     0.609448
yLag1             -0.0337028  0.0192679  -1.749     0.082066 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.711 on 170 degrees of freedom
Multiple R-squared: 0.2919, Adjusted R-squared: 0.2628
F-statistic: 10.01 on 7 and 170 DF, p-value: 0.0000000001885
The underlying ECM is

\Delta y_t = \alpha + \sum_l \beta_{l,\text{short-term}} \Delta x_{l,t} + \gamma \left( y_{t-1} - \sum_l \beta_{l,\text{long-term}} x_{l,t-1} \right) + \varepsilon_t,

and ecm() transforms it so that the model can be estimated in a single regression:

\Delta y_t = \alpha + \sum_l \beta_{l,\text{short-term}} \Delta x_{l,t} + \gamma y_{t-1} + \sum_l \gamma_{l,\text{long-term}} x_{l,t-1} + \varepsilon_t

with \gamma_{l,\text{long-term}} = -\beta_{l,\text{long-term}} \gamma.
I can run this model easily in Stan and recover the same parameters as in the example:
data <- data.frame(y = trn$Wilshire5000,
                   trn %>%
                     dplyr::select(CorpProfits,
                                   FedFundsRate,
                                   UnempRate)) %>%
  mutate_all(list(lag = ~ dplyr::lag(.),
                  fd  = ~ . - dplyr::lag(.))) %>%
  .[complete.cases(.), ] %>%
  dplyr::select(-CorpProfits,
                -FedFundsRate,
                -UnempRate,
                -y)
# Stan data
standata <- list(N = NROW(data$y_fd),
                 K = ncol(data %>%
                            dplyr::select(ends_with("fd"),
                                          -y_fd)),   # number of coefficients for short-term effects
                 L = ncol(data %>%
                            dplyr::select(ends_with("lag"),
                                          -y_lag)),  # number of coefficients for long-term effects
                 Y_fd = as.numeric(data$y_fd),   # differenced outcome
                 Y_lag = as.numeric(data$y_lag), # lagged outcome
                 X = data %>%
                       dplyr::select(ends_with("fd"),
                                     -y_fd),    # matrix of differenced predictors
                 Z = data %>%
                       dplyr::select(ends_with("lag"),
                                     -y_lag)    # matrix of lagged predictors
                 )
# Stan
stan_code_ecm = "
data {
  int<lower=0> N;  // number of observations
  int<lower=0> K;  // number of coefficients for short-term effects
  int<lower=0> L;  // number of coefficients for long-term effects
  matrix[N, K] X;  // matrix of differenced predictors
  matrix[N, L] Z;  // matrix of lagged predictors
  vector[N] Y_fd;  // differenced outcome
  vector[N] Y_lag; // lagged outcome
}
parameters {
  real alpha_st;
  vector[K] beta_st;
  vector[L] gamma_lt;
  real gamma;
  real<lower=0> sigma;
}
model {
  // priors
  alpha_st ~ student_t(4, 0, 10);
  beta_st ~ student_t(4, 0, 10);
  gamma ~ student_t(4, 0, 10);
  gamma_lt ~ student_t(4, 0, 10);
  sigma ~ student_t(4, 0, 10);
  // likelihood
  Y_fd ~ normal(alpha_st + X*beta_st + gamma*Y_lag + Z*gamma_lt, sigma);
}"
fit_ecm = stan(model_code = stan_code_ecm,
data = standata,
chains = 4,
iter = 2000,
cores = 4)
print(fit_ecm, digits=3)
Inference for Stan model: 72ebbdea57dae7e762831d6c92d0a799.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha_st -0.441 0.021 0.827 -2.063 -0.996 -0.432 0.122 1.175 1613 1.001
beta_st[1] 0.012 0.000 0.002 0.008 0.011 0.012 0.013 0.016 5832 1.000
beta_st[2] -0.126 0.003 0.158 -0.432 -0.232 -0.128 -0.019 0.179 2031 1.002
beta_st[3] -1.492 0.010 0.448 -2.357 -1.791 -1.485 -1.195 -0.625 1999 1.001
gamma_lt[1] 0.003 0.000 0.001 0.001 0.002 0.003 0.003 0.004 2204 1.000
gamma_lt[2] 0.066 0.001 0.048 -0.029 0.034 0.066 0.100 0.159 2362 1.000
gamma_lt[3] -0.050 0.003 0.105 -0.252 -0.123 -0.051 0.021 0.153 1670 1.001
gamma -0.033 0.000 0.019 -0.070 -0.046 -0.033 -0.021 0.006 1894 1.000
sigma 1.724 0.002 0.091 1.553 1.661 1.721 1.783 1.911 2813 1.000
lp__ -184.642 0.054 2.159 -189.825 -185.819 -184.270 -183.044 -181.532 1594 1.001
Samples were drawn using NUTS(diag_e) at Wed Dec 11 20:49:52 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
But since \gamma_{l,\text{long-term}} is the product of -\beta_{l,\text{long-term}} and \gamma, I am also interested in \beta_{l,\text{long-term}} itself and its CI.
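As a quick plausibility check (using only the point estimates from summary(model1) above, so without any uncertainty), the long-term coefficient can be backed out by hand via \beta_{l,\text{long-term}} = -\gamma_{l,\text{long-term}}/\gamma, e.g. for CorpProfits:

```r
# Back out the long-term coefficient from the one-step regression's
# point estimates (values taken from summary(model1) above).
gamma_lt <- 0.0027077    # coefficient on CorpProfitsLag1
gamma    <- -0.0337028   # coefficient on yLag1
beta_lt  <- -gamma_lt / gamma  # since gamma_lt = -beta_lt * gamma
beta_lt
# about 0.080
```

But that only gives a point estimate, not the CI I am after.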
I have already tried a couple of things:
(1) Directly defining the likelihood via the untransformed model:
model {
  ...
  // likelihood
  Y_fd ~ normal(alpha_st + X*beta_st + gamma*(Y_lag - Z*beta_lt), sigma);
}
This gives me weird results: divergent transitions, high Rhat, and low n_eff.
Warning messages:
1: There were 138 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: Examine the pairs() plot to diagnose sampling problems
3: The largest R-hat is 1.54, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
Inference for Stan model: ebeaa5664f6c938ba1ee5875fa240a00.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha_st -0.520 0.015 0.515 -1.707 -0.825 -0.441 -0.172 0.300 1156 1.014
beta_st[1] 0.012 0.000 0.002 0.008 0.010 0.012 0.013 0.016 5019 1.000
beta_st[2] -0.086 0.003 0.154 -0.380 -0.191 -0.087 0.019 0.217 2259 1.000
beta_st[3] -1.368 0.010 0.441 -2.212 -1.665 -1.361 -1.079 -0.485 2032 1.001
beta_lt[1] 0.067 0.380 0.705 -2.025 0.033 0.175 0.363 1.319 3 1.557
beta_lt[2] 3.256 4.267 11.698 -22.506 -1.332 4.060 9.095 25.032 8 1.184
beta_lt[3] 0.437 0.292 10.265 -20.389 -4.819 0.236 5.551 22.165 1238 1.015
gamma -0.005 0.003 0.008 -0.024 -0.009 -0.004 0.000 0.006 5 1.282
sigma 1.733 0.002 0.095 1.559 1.668 1.727 1.794 1.934 2474 1.001
lp__ -186.473 0.415 2.477 -192.042 -187.865 -186.213 -184.748 -182.358 36 1.056
Samples were drawn using NUTS(diag_e) at Wed Dec 11 22:34:39 2019.
(2) Calculating \beta_{l,\text{long-term}} as a generated quantity:
generated quantities {
  vector[L] beta_lt;
  beta_lt = gamma_lt / gamma;
}
This also gives me weird results, with very broad CIs:
Inference for Stan model: a99af2ac032d43240be78c06c0fe10b0.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha_st -0.396 0.022 0.834 -2.057 -0.958 -0.399 0.160 1.231 1402 1.002
beta_st[1] 0.012 0.000 0.002 0.008 0.011 0.012 0.013 0.016 6873 1.000
beta_st[2] -0.127 0.003 0.156 -0.429 -0.235 -0.125 -0.019 0.175 2086 1.002
beta_st[3] -1.497 0.009 0.441 -2.333 -1.805 -1.508 -1.202 -0.615 2665 1.001
gamma_lt[1] 0.003 0.000 0.001 0.001 0.002 0.003 0.003 0.004 2485 1.000
gamma_lt[2] 0.065 0.001 0.050 -0.032 0.031 0.064 0.098 0.163 2109 1.001
gamma_lt[3] -0.055 0.003 0.104 -0.258 -0.125 -0.055 0.015 0.149 1554 1.002
gamma -0.034 0.000 0.019 -0.072 -0.047 -0.034 -0.021 0.003 1944 1.000
sigma 1.723 0.002 0.095 1.551 1.655 1.716 1.785 1.919 2678 1.000
beta_lt[1] -0.651 0.375 23.776 -0.353 -0.099 -0.078 -0.065 0.083 4010 1.000
beta_lt[2] -38.847 25.799 1633.578 -19.133 -3.424 -1.724 -0.675 7.102 4009 1.000
beta_lt[3] -14.000 10.351 645.256 -14.341 -0.354 1.708 3.398 12.546 3886 1.000
lp__ -184.625 0.054 2.172 -189.815 -185.794 -184.295 -183.081 -181.407 1630 1.002
Samples were drawn using NUTS(diag_e) at Wed Dec 11 22:24:07 2019.
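I suspect the broad CIs come from dividing by \gamma, whose posterior is not clearly separated from zero. A purely synthetic illustration (random normal draws roughly matching the posterior means/sds reported above, not actual posterior draws) shows how heavy-tailed such a ratio gets:

```r
# Synthetic draws mimicking the reported posterior summaries (NOT real draws):
# when the denominator's distribution has mass near zero, the ratio explodes.
set.seed(123)
gamma    <- rnorm(4000, mean = -0.034, sd = 0.019)  # like gamma's posterior
gamma_lt <- rnorm(4000, mean =  0.003, sd = 0.001)  # like gamma_lt[1]'s posterior
beta_lt  <- -gamma_lt / gamma
quantile(beta_lt, probs = c(0.025, 0.5, 0.975))
# the median sits near the point estimate, but the tails are extreme
```

So the ratio itself seems ill-behaved whenever \gamma is not well away from zero.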
Now I wonder what the best way is to estimate \beta_{l,\text{long-term}} and to get CIs for these parameters.
Thank you very much in advance!