Hello,
I am trying to implement local linear trend with regression components in stan (through rstan). This is documented under “Nowcasting” section of “BSTS blog” here - BTW, there is no seasonal component in my case - just trend and regression:
I already use BSTS package, and thought it will be good to be able to implement it in Stan. So, I wrote the basic model along with data simulation, both of which are attached here. It looks like slope and level variance are not well-identified in the model. So, will appreciate pointers on how to implement that equivalent model from BSTS package. I looked around this forum for non-centered, centered time series etc. but didn’t find anything helpful yet on how to go about fixing the problem.
Model:
data {
int <lower=0> T;
vector[T] x;
vector[T] y;
}
parameters {
vector[T] u; //Level
vector[T] v; //Slope
real beta;
real <lower=0> s_obs;
real <lower=0> s_slope;
real <lower=0> s_level;
}
model {
v[1] ~ normal(0, s_slope);
tail(v,T-1) ~ normal(head(v,T-1),s_slope);
u[1] ~ normal(0, s_level*10); //Add a large scale since time series start can be from anywhere in the stream
tail(u,T-1) ~ normal(head(u,T-1) + head(v,T-1),s_level);
y ~ normal (u + beta*x,s_obs);
}
Data Generation and Testing (rstan) - looks like s_slope and s_level are poorly identified, and need some kind of centering to avoid the funnel. I have included the pairs plot below.
library('rstan')
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
#Generate dummy data for model testing
# model: y[t] = u[t] + b*x[t] + s_obs , s_* is sigma of noise
# u[t+1] = u[t] + v[t] + s_level
# v[t+1] = v[1] + s_slope
beta <- rnorm(250,1,0.1)
v <- rep(-0.0001,250)
u <- rep(0.0,250)
s_slope <- rnorm(250,0,0.005)
s_level <- rnorm(250,0,0.005)
for (i in 2:250){
u[i] <- u[i-1] + v[i-1] + s_level[i-1]
v[i] <- v[i-1] + s_slope[i-1]
}
s_obs <- rnorm(250,0,1)
x <- runif(250,0,1)
y <- u + beta*x + s_obs
n <- 250
ytrain <- y[1:n]
xtrain <- x[1:n]
#Load stan model file, and fit to data
fit <- stan(file = 'test.stan', data = list( T = length(ytrain), y = ytrain, x = xtrain) , iter=500, chains=4, control = list(max_treedepth = 10))
# Do pairs plot of different parameters
pairs(fit,pars = c("s_slope","s_level","s_obs","beta"))
rhat for s_slope and s_level are around 1.09 and 1.62 in one simulation. Also, attached the pairs plot.
test.R (909 Bytes)
test.stan (502 Bytes)