I am trying to estimate the following stochastic volatility model with leverage for stock price time series data.

y[t] = mu2 + exp(h[t] / 2) * eps1[t]
h[t] = mu1 + phi * (h[t-1] - mu1) + sigma * rho * (y[t-1] - mu2) / exp(h[t-1] / 2) + sigma * sqrt(1 - rho^2) * eps2[t]
eps1[t], eps2[t] are i.i.d. N(0, 1)

y is the vector of stock returns and h is the vector of hidden log volatilities.
data {
  int<lower=0> T;    // # in-sample time points (equally spaced)
  vector[T] y;       // mean-corrected return at time t
  int<lower=0> N;    // # out-of-sample time points (equally spaced)
  vector[N] y_new;   // mean-corrected out-of-sample returns
}
parameters {
  real mu1;                     // mean log volatility
  real mu2;                     // mean index return
  real<lower=0, upper=1> phi;   // persistence of volatility; bounds must match the uniform(0, 1) prior
  real<lower=0> sigma;          // white noise shock scale
  vector[T] h;                  // log volatility at time t
  real<lower=-1, upper=1> rho;  // leverage; bounds must match the uniform(-1, 1) prior
}
model {
  phi ~ uniform(0, 1);
  sigma ~ cauchy(0, 2);
  mu1 ~ cauchy(-9, 5);
  mu2 ~ normal(0, 0.1);
  rho ~ uniform(-1, 1);
  h[1] ~ normal(mu1, sigma / sqrt(1 - phi * phi));
  for (t in 1:T)
    y[t] ~ normal(mu2, exp(h[t] / 2));
  for (t in 2:T)
    h[t] ~ normal(mu1 + phi * (h[t - 1] - mu1)
                  + rho * sigma * exp(-h[t - 1] / 2) * (y[t - 1] - mu2),
                  sqrt(1 - rho * rho) * sigma);
}
generated quantities {
  vector[N + 1] h_new;
  h_new[1] = normal_rng(mu1 + phi * (h[T] - mu1)
                        + rho * sigma * exp(-h[T] / 2) * (y[T] - mu2),
                        sqrt(1 - rho * rho) * sigma);
  for (i in 2:(N + 1))
    h_new[i] = normal_rng(mu1 + phi * (h_new[i - 1] - mu1)
                          + rho * sigma * exp(-h_new[i - 1] / 2) * (y_new[i - 1] - mu2),  // lag h_new here, not mu1, to match the transition in the model block
                          sqrt(1 - rho * rho) * sigma);
}
I first estimate the parameters and h on a window of the past 500 days, and then use those parameters to generate h_new from the realized returns y_new of the next 50 days.
Then, by rolling this window forward and repeating, I try to cover about ten years of data.
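Concretely, each window is built roughly like this (a minimal sketch, not my actual script: it assumes the full ten-year mean-corrected return series is in a NumPy array called returns, and the names window and horizon are placeholders for 500 and 50):

window, horizon = 500, 50
# slide forward by one horizon (50 days) per iteration
for start in range(0, len(returns) - window - horizon + 1, horizon):
    data_dat = {
        'T': window,
        'y': returns[start : start + window],                         # 500 in-sample days
        'N': horizon,
        'y_new': returns[start + window : start + window + horizon],  # next 50 days
    }
    # data_dat is then passed to sm.sampling(...) as in the loop below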
The problem is that each repetition takes longer than the last, and the growth is rapid: the first estimation and generation of h_new took 5 minutes, but by the sixth iteration it was taking 15 minutes.
The Stan model is compiled only once, up front, and the sampling call below is the only thing run repeatedly in the Python loop; after each fit I put the posterior mean of the estimated h_new into a pre-prepared data frame.
for i in range(100):
    fit = sm.sampling(data=data_dat, iter=10000, chains=4, thin=2)
    mean = pd.Series(fit.extract(pars=['h_new'])['h_new'].mean(axis=0))  # posterior mean of h_new
    data = data.append(mean, ignore_index=True)
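One variation I am considering, in case the slowdown comes from how I accumulate results rather than from the sampler: collect the posterior means in a plain list, drop each fit object before the next window, and build the data frame once at the end, since DataFrame.append copies the entire frame on every call. A sketch under those assumptions:

import pandas as pd

rows = []
for i in range(100):
    fit = sm.sampling(data=data_dat, iter=10000, chains=4, thin=2)
    rows.append(fit.extract(pars=['h_new'])['h_new'].mean(axis=0))  # keep only the posterior mean path
    del fit  # release the stored draws before the next window
data = pd.DataFrame(rows)

With iter=10000 and chains=4, each fit object holds thousands of draws of the length-T vector h, so holding on to old fits gets expensive quickly.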
Is repeatedly calling Stan's fit on a moving window like this a bad way to produce predictions?
Is there a way to streamline the computation, or at least to ensure that each iteration takes roughly the same amount of time?