I’m making a multivariate stochastic volatility model with correlation on the returns but not the volatility processes. I’ve already reparameterized the volatility process parameter ‘h’, which dramatically improved the speed and mixing. I’m wondering if it’s possible to also reparameterize the sampling distribution for the ‘returns’ data variable. It’s my understanding that it should be possible to specify returns as being sampled from a standard normal rescaled by the volatility taken from ‘h’ and the Cholesky decomposition of the correlation matrix, but I’m not sure how to implement that. Any other tips for efficiency are appreciated as well. Thanks!
data {
  int len;                     // number of time points
  int num;                     // number of series
  vector[num] returns[len];    // observed returns
  corr_matrix[num] corr;       // fixed correlation of the returns
}
transformed data {
  cholesky_factor_corr[num] L;
  L = cholesky_decompose(corr);
}
parameters {
  vector[num] mu;                      // mean log volatility
  vector<lower=-1, upper=1>[num] phi;  // persistence of log volatility
  vector<lower=0>[num] sigma;          // scale of log-volatility innovations
  vector[num] h_std[len];              // standardized innovations (non-centered)
}
transformed parameters {
  vector[num] h[len];  // log volatility
  h[1] = mu;
  for (t in 2:len) {
    h[t] = h_std[t] .* sigma;
    h[t] += mu;
    h[t] += phi .* (h[t-1] - mu);
  }
}
model {
  mu ~ normal(0, 25);
  phi ~ uniform(-1, 1);
  sigma ~ inv_gamma(2.5, .025);
  for (t in 1:len) {
    h_std[t] ~ std_normal();
    returns[t] ~ multi_normal_cholesky(rep_vector(0, num),
                                       diag_pre_multiply(exp(h[t] / 2), L));
  }
}
If I understand you correctly, you want to reparametrize the returns[t] ~ multi_normal_cholesky(...) statement? If so, this is probably not very useful. Reparametrizations (changing what is in the parameters block) are useful because they change the geometry of the space the sampler is exploring. If you only change the way you compute the likelihood (the sampling statement), the induced geometry on the actual parameters should AFAIK stay identical (or you've changed the model), and hence little improvement is to be expected.
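For completeness, the rescaled form you describe can be written down; here is a minimal sketch of the model block. It transforms returns[t] to the standard-normal scale and adds the change-of-variables (Jacobian) term in h. Note that the resulting log density is exactly the multi_normal_cholesky density, which is why the geometry does not change:

model {
  mu ~ normal(0, 25);
  phi ~ uniform(-1, 1);
  sigma ~ inv_gamma(2.5, .025);
  for (t in 1:len) {
    vector[num] z;
    // Undo the scaling: z is standard normal exactly when returns[t] has the
    // multi_normal_cholesky distribution from the original model.
    z = mdivide_left_tri_low(L, returns[t] ./ exp(h[t] / 2));
    h_std[t] ~ std_normal();
    z ~ std_normal();
    // Log-Jacobian of the transformation; the part coming from L is constant
    // in the parameters and can be dropped.
    target += -sum(h[t]) / 2;
  }
}

Since this is algebraically identical to what you already have, I would not expect a speedup; the specialized multi_normal_cholesky form may even be slightly faster.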
Other than that, some quick observations (without necessarily understanding your model deeply):

phi ~ uniform(-1, 1);

This is weird: is there some hard physical/mathematical constraint that forces such sharp bounds on phi? Sharp boundaries tend to hinder sampling unless they represent actual hard constraints. (Note also that since phi is declared with <lower=-1, upper=1>, the uniform statement is redundant; the declared bounds already imply a uniform prior over the interval.) If the constraint is soft, you could instead have

phi ~ normal(0, 0.5);

i.e., so that roughly 95% of the prior mass falls within (-1, 1).
In general it is good to avoid repeatedly assigning to the same variable (when it depends on parameters), so a more efficient formulation would be:
h[t] = h_std[t] .* sigma + mu + phi .* (h[t-1] - mu);
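In context, the transformed parameters block would then become (a minimal sketch):

transformed parameters {
  vector[num] h[len];  // log volatility
  h[1] = mu;
  for (t in 2:len) {
    // one assignment per iteration instead of three in-place updates
    h[t] = h_std[t] .* sigma + mu + phi .* (h[t-1] - mu);
  }
}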
Also, maybe there is a non-identifiability between phi and mu? My hunch is that when you flip the sign of mu, you can then change phi (and possibly some of h_std) to get the exact same h. I can't make it work exactly, so you had better check yourself, but if this is the case, it would result in a multimodal posterior, which is a big problem.
Finally, h[1] = mu means that h_std[1] is never used: it is sampled from its prior but never affects the likelihood. This may cause issues. You should have vector[num] h_std[len - 1]; and shift the indices accordingly, as in the sketch below.
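A minimal sketch of the corresponding index shifts (keeping h[1] = mu; data and transformed data are unchanged):

parameters {
  vector[num] mu;
  vector<lower=-1, upper=1>[num] phi;
  vector<lower=0>[num] sigma;
  vector[num] h_std[len - 1];  // one fewer innovation than time points
}
transformed parameters {
  vector[num] h[len];
  h[1] = mu;
  for (t in 2:len) {
    // h_std[t - 1] drives the transition from t - 1 to t
    h[t] = h_std[t - 1] .* sigma + mu + phi .* (h[t-1] - mu);
  }
}
model {
  mu ~ normal(0, 25);
  phi ~ uniform(-1, 1);
  sigma ~ inv_gamma(2.5, .025);
  for (t in 1:(len - 1))
    h_std[t] ~ std_normal();
  for (t in 1:len)
    returns[t] ~ multi_normal_cholesky(rep_vector(0, num),
                                       diag_pre_multiply(exp(h[t] / 2), L));
}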
Hope that helps and best of luck!