Performance of stochastic volatility model

In the middle of designing general SDE solvers, I wonder if anyone has experience with applying stochastic volatility models in practice. In particular, since the model samples a Wiener-process increment at every time step, the number of normal step-increment parameters (h_t) easily reaches \sim 10^3 when the modeled period is on the scale of a year, assuming each step corresponds to a day. How does such a model fare in terms of performance? Does it meet practical expectations?
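
For concreteness, here’s a minimal sketch of the kind of discretization I have in mind (Python, with purely illustrative parameter values): a log-volatility following an AR(1)/Euler-type scheme, drawing one normal increment per daily step.

```python
import numpy as np

# Minimal sketch of a discretized stochastic volatility model.
# Parameter values are illustrative only. Log-volatility h_t follows
# an AR(1) scheme (an Euler-type discretization of an OU process);
# returns y_t are observed.

def simulate_sv(T=252, mu=-1.0, phi=0.97, sigma=0.15, seed=0):
    rng = np.random.default_rng(seed)
    h = np.empty(T)
    h[0] = rng.normal(mu, sigma / np.sqrt(1 - phi**2))  # stationary draw
    for t in range(1, T):
        # one normal (Wiener) increment per step -> T latent parameters
        h[t] = mu + phi * (h[t - 1] - mu) + sigma * rng.normal()
    y = np.exp(h / 2) * rng.normal(size=T)  # observed returns
    return h, y

h, y = simulate_sv()
print(h.size)  # 252 daily steps, i.e. one latent parameter per trading day
```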

I do. I usually use particle MCMC. You don’t have to hold onto all the state samples throughout time, so it’s relatively memory-efficient, but it’s still generally slow and hard to tune. This approach also has problems in the large-T situation, but for a different reason.

Thanks.

Care to elaborate?

Sure. Particle MCMC is similar to regular Metropolis-Hastings targeting the marginal posterior (parameters but not states/volatilities). The difference is that the likelihood isn’t available, so it is approximated with an unbiased estimate from a particle filter. So at every iteration of the MCMC sampler, you run a particle filter through your time-series data. Running through the data is done recursively, so at any given time you’re only holding onto samples that approximate a “filtering distribution” at one time point.

The more accurate the likelihood estimates, the better the mixing. This is where the tuning is difficult. Using more particles is generally the way to improve accuracy, but that also means more computation time at each iteration. You can also be clever about which kind of particle filter you use, and this makes a difference. Finally, the variance of the log-likelihood estimate grows at best like O(T), so longer time series get tough.
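
To make that concrete, here’s a rough Python sketch of a bootstrap filter for an SV model like the one above. The observation density and all names are illustrative (my actual code is in C++), but it shows where the unbiased likelihood estimate comes from:

```python
import numpy as np

# Rough sketch of a bootstrap particle filter for an SV model,
# returning the log of an unbiased likelihood estimate.
# Everything here (names, observation density) is illustrative.

def bootstrap_pf_loglike(y, mu, phi, sigma, N=1000, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    h = rng.normal(mu, sigma / np.sqrt(1 - phi**2), size=N)  # t = 0 draws
    loglike = 0.0
    for t in range(len(y)):
        if t > 0:
            # propagate particles through the state transition
            h = mu + phi * (h - mu) + sigma * rng.normal(size=N)
        # weight by the observation density p(y_t | h_t), y_t ~ N(0, e^{h_t})
        logw = -0.5 * (np.log(2 * np.pi) + h + y[t] ** 2 * np.exp(-h))
        m = logw.max()
        w = np.exp(logw - m)
        loglike += m + np.log(w.mean())  # log of the t-th unbiased factor
        # multinomial resampling: only current-time particles are kept,
        # which is why memory use doesn't grow with T
        h = h[rng.choice(N, size=N, p=w / w.sum())]
    return loglike
```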

I’d be interested in knowing how well an approach like this fits within Stan, because it’s decidedly not Hamiltonian-y. Not only do you not have gradients, you don’t even have likelihoods. If there’s any interest, I do have quite a bit of relatively battle-tested C++ code for this sort of thing.

Just a warning that there are a whole lot of techniques called “particle MCMC,” ranging from sequential Monte Carlo (SMC) filtering-type algorithms to ensemble methods like differential evolution. I’m not sure what kind of particle method could proceed without a likelihood. Do you have a reference?

For algorithms, Stan exposes the model class, which can provide log densities and derivatives. Given that Stan is based on writing log densities and providing derivatives, density-free or even derivative-free methods aren’t particularly attractive. The only dimensionally scalable MCMC algorithm (in the sense of the computational complexity of drawing a sample) is HMC, and it requires derivatives.

The treatment for the lack of a likelihood here is just like the treatment in an ODE model: a sophisticated ODE doesn’t have a closed-form likelihood w.r.t. its parameters, so we simply resort to an approximate likelihood based on the numerical solution. Here, given a sample path of the process, the approximate likelihood would be based on the path data, the numerical discretization, and the fact that the driving process is Brownian.

> Just a warning that there are a whole lot of techniques called “particle MCMC,” ranging from sequential Monte Carlo (SMC) filtering-type algorithms to ensemble methods like differential evolution. I’m not sure what kind of particle method could proceed without a likelihood. Do you have a reference?

I’m not familiar with differential evolution, but I am referring to using SMC/particle-filtering algorithms within a Metropolis-Hastings algorithm. I wrote some slides on this earlier in the summer here, and those have some references in them. Algorithmically it’s pretty simple: you just replace the “real” likelihood in the acceptance ratio with an estimate of it, and the sampler is still asymptotically exact in the same way regular MH is. Usually the big requirement is that the likelihood estimate is nonnegative and unbiased.
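
Schematically, the loop looks like ordinary random-walk MH with the particle-filter estimate dropped in. Continuing the bootstrap_pf_loglike sketch from above; the proposal scale and log_prior here are illustrative stand-ins:

```python
import numpy as np

# Sketch of the pseudo-marginal Metropolis-Hastings loop: identical to
# ordinary random-walk MH, except the exact log-likelihood is replaced
# by the particle-filter estimate (bootstrap_pf_loglike, sketched above).
# The proposal scale and log_prior are illustrative stand-ins;
# log_prior should return -inf outside the parameter support.

def pmmh(y, theta0, log_prior, n_iters=5000, step=0.05, N=1000, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    theta = np.asarray(theta0, dtype=float)  # (mu, phi, sigma)
    ll = bootstrap_pf_loglike(y, *theta, N=N, rng=rng)
    draws = []
    for _ in range(n_iters):
        prop = theta + step * rng.normal(size=theta.size)
        lp_prop = log_prior(prop)
        if np.isfinite(lp_prop):
            ll_prop = bootstrap_pf_loglike(y, *prop, N=N, rng=rng)
            log_alpha = ll_prop + lp_prop - ll - log_prior(theta)
            if np.log(rng.uniform()) < log_alpha:
                # accept; crucially, the noisy estimate ll is stored and
                # reused, never recomputed for the current theta
                theta, ll = prop, ll_prop
        draws.append(theta.copy())
    return np.array(draws)
```

The one subtle point is reusing the stored estimate for the current point: re-running the filter for the current theta at each iteration would break the exactness argument.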

> For algorithms, Stan exposes the model class, which can provide log densities and derivatives. Given that Stan is based on writing log densities and providing derivatives, density-free or even derivative-free methods aren’t particularly attractive. The only dimensionally scalable MCMC algorithm (in the sense of the computational complexity of drawing a sample) is HMC, and it requires derivatives.

Understood. I have a particle filtering library that asks the user to specify densities and lets the user choose which particle filter to subclass. Each base class provides a method that gives you approximate log-likelihoods, but if scalability is a requirement, then yes, that might be a problem right now.

> The treatment for the lack of a likelihood here is just like the treatment in an ODE model: a sophisticated ODE doesn’t have a closed-form likelihood w.r.t. its parameters, so we simply resort to an approximate likelihood based on the numerical solution. Here, given a sample path of the process, the approximate likelihood would be based on the path data, the numerical discretization, and the fact that the driving process is Brownian.

Do you have a reference I could take a look at? I’m not really familiar with ODE models, but this sounds pretty interesting. So you do this approximation at every iteration of an MCMC algorithm?

The most important aspect of the method you are referencing is not that it uses particles but rather that it uses a pseudo-marginal updating scheme. I recommend referring to it as a pseudo-marginal scheme with particle-based proposals, which should avoid most of the confusion with other methods and make it easier for others to find relevant references.

From now on I’ll use both terms instead of guessing at one :) In general, though, particle MCMC isn’t a subset of the pseudo-marginal approach.

I was abusing the term “approximate likelihood”; what I should have said is “conditioning on the numerical solution,” so that with ODE parameters \theta and observed data y_{\text{obs}}, we have p(y_{\text{obs}} \mid \theta) = p(y_{\text{obs}} \mid y_{\text{num}}(\theta)). In a sense, one can read the numerical integrator as a complicated regressor, and most often its solution y_{\text{num}} serves as an unbiased mean, as in the example in https://www.sciencedirect.com/science/article/pii/S030439750800501X
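
As a sketch of what I mean (the logistic-growth ODE and the Gaussian observation model here are just illustrative placeholders):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import norm

# Sketch of "conditioning on the numerical solution": the likelihood of
# the observed data is evaluated at the solver output y_num(theta).
# The logistic-growth ODE and Gaussian noise model are placeholders.

def log_lik(theta, t_obs, y_obs, sigma_obs=0.1):
    r, K = theta
    sol = solve_ivp(lambda t, y: r * y * (1 - y / K),
                    (t_obs[0], t_obs[-1]), [0.1], t_eval=t_obs)
    y_num = sol.y[0]  # numerical solution given theta
    # p(y_obs | theta) = p(y_obs | y_num(theta))
    return norm.logpdf(y_obs, loc=y_num, scale=sigma_obs).sum()

t_obs = np.linspace(0.0, 10.0, 25)
y_obs = 1.0 / (1.0 + 9.0 * np.exp(-0.8 * t_obs))  # exact curve, r=0.8, K=1
print(log_lik((0.8, 1.0), t_obs, y_obs))
```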

Also, I should be more careful about using “closed form” to mean “analytical form,” since in general “closed form” suggests something closer to “tractable.”

Pretty cool: this looks like another example of the pseudo-marginal approach. It predates the paper that introduced that name, which explains why the name doesn’t pop up in it.

Coming back to Stan, off the top of my head there are only a few approaches that handle this sort of model with an intractable likelihood: (1) sample the joint posterior with the currently available options, (2) implement some sort of pseudo-marginal scheme, perhaps one that handles particle MCMC as a special case, or (3) use approximate Bayesian computation, which I’ve heard has been applied to this sort of thing. Regarding (1), where are we at with HMC being able to handle discrete targets?

I think @betanalpha was just saying that the choice of approximate vs. full likelihood is independent of the choice of SMC.