Hi, I was trying to fit the example stochastic volatility model from section 2.5 ("Stochastic volatility models") of the Stan User's Guide:
// SVM.stan
data {
  int<lower=0> T;                 // # time points (equally spaced)
  vector[T] y;                    // mean-corrected return at time t
}
parameters {
  real mu;                        // mean log volatility
  real<lower=-1, upper=1> phi;    // persistence of volatility
  real<lower=0> sigma;            // white-noise shock scale
  vector[T] h_std;                // std log volatility at time t
}
transformed parameters {
  vector[T] h = h_std * sigma;    // now h ~ normal(0, sigma)
  h[1] /= sqrt(1 - phi * phi);    // rescale h[1]
  h += mu;
  for (t in 2:T)
    h[t] += phi * (h[t - 1] - mu);
}
model {
  phi ~ uniform(-1, 1);
  sigma ~ cauchy(0, 5);
  mu ~ cauchy(0, 10);
  h_std ~ std_normal();
  for (t in 1:T)
    y[t] ~ normal(0, exp(h[t] / 2));
}
generated quantities {            // this part is mine
  vector[T] log_lik;
  for (t in 1:T)
    log_lik[t] = normal_lpdf(y[t] | 0, exp(h[t] / 2));
}
to some example financial data:
library(quantmod)
library(cmdstanr)
getSymbols("AAPL")  # load the AAPL price series
y <- diff(log(as.vector(AAPL$AAPL.Close))) * 100; y[1] <- 0  # returns in "percent"
SVM <- cmdstan_model("SVM.stan")  # compile the example model
svm <- SVM$sample(data = list(T = length(y), y = y), chains = 4, parallel_chains = 4)
In general, with this data or other similar series, the loo results are poor:
svm$loo()

Computed from 4000 by 3713 log-likelihood matrix

         Estimate    SE
elpd_loo  -7235.1  59.4
p_loo       360.6  15.7
looic     14470.2 118.8
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                        Count Pct.  Min. n_eff
(-Inf, 0.5] (good)       3431 92.4%  453
 (0.5, 0.7] (ok)          201  5.4%  103
   (0.7, 1] (bad)          73  2.0%   20
   (1, Inf) (very bad)      8  0.2%    5
See help('pareto-k-diagnostic') for details.
Warning message:
Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
For what it's worth, equivalent GARCH models do just fine and show no loo problems on this data or any other similar data.

Am I computing the pointwise log-likelihood wrong in the generated quantities block?
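As a sanity check on my own code, I verified that the Stan line `normal_lpdf(y[t] | 0, exp(h[t] / 2))` matches the normal log-density computed by hand and by `dnorm(..., log = TRUE)` in base R (the values of `y_t` and `h_t` below are just made-up numbers, not draws from the fit):

```r
# Check the log-likelihood formula used in generated quantities:
#   log_lik[t] = normal_lpdf(y[t] | 0, exp(h[t] / 2))
# against base R's dnorm on the log scale.
y_t <- 1.3   # hypothetical mean-corrected return
h_t <- -0.4  # hypothetical log-volatility draw
# normal log-density with mean 0 and sd = exp(h/2):
#   -0.5 * log(2*pi) - log(sd) - y^2 / (2 * sd^2)
manual <- -0.5 * log(2 * pi) - h_t / 2 - y_t^2 / (2 * exp(h_t))
via_dnorm <- dnorm(y_t, mean = 0, sd = exp(h_t / 2), log = TRUE)
all.equal(manual, via_dnorm)  # TRUE
```

So at least the density itself seems right; my worry is more about whether conditioning on `h[t]` is the correct pointwise log-likelihood for loo here.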
thank you