Fast Gaussian Processes: kernel interpolation + Kronecker utilisation

Has anyone thought about implementing the methods described in Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP) or Thoughts on Massively Scalable Gaussian Processes?

These papers describe a methodology that reduces the O(n^3) learning complexity to O(n) and makes the prediction complexity O(1), with potential scaling to millions or billions of points.

Basically, it requires implementing Kronecker matrix algebra, including fast matrix-vector multiplies.
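
Not Stan code, but to make that building block concrete, here's a minimal NumPy sketch (the function name `kron_mvm` is mine, not from the papers) of multiplying a vector by K_1 ⊗ K_2 ⊗ … ⊗ K_d one small factor at a time, which is the operation these methods rely on:

```python
import numpy as np

def kron_mvm(Ks, x):
    """Compute (K_1 ⊗ K_2 ⊗ ... ⊗ K_d) @ x without forming the full matrix.

    Ks : list of (m_j, m_j) arrays, one per grid dimension
    x  : vector of length prod(m_j)
    """
    ms = [K.shape[0] for K in Ks]
    z = x.reshape(ms)  # view the vector as a d-dimensional array on the grid
    for j, K in enumerate(Ks):
        z = np.tensordot(K, z, axes=([1], [j]))  # apply K_j along axis j
        z = np.moveaxis(z, 0, j)                 # move the new axis back to position j
    return z.reshape(-1)

# Sanity check against the dense Kronecker product on a tiny example
rng = np.random.default_rng(0)
Ks = [rng.standard_normal((m, m)) for m in (3, 4, 5)]
x = rng.standard_normal(3 * 4 * 5)
assert np.allclose(np.kron(np.kron(Ks[0], Ks[1]), Ks[2]) @ x, kron_mvm(Ks, x))
```

The cost is roughly O(sum_j m_j * prod_k m_k) instead of O((prod_j m_j)^2) for the dense matrix-vector product.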

Looks really impressive…


Not a full implementation of KISS-GP or MSGP, but Rob Trangucci has example implementations of the Kronecker trick here.
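
For anyone who hasn't seen it, the basic Kronecker trick for GP regression on a full grid looks roughly like the sketch below. This is a hedged NumPy illustration (not Rob's Stan code; the function name `kron_gp_logmarg` is mine), assuming a Gaussian likelihood and a product kernel evaluated on a Cartesian grid:

```python
import functools
import numpy as np

def kron_gp_logmarg(Ks, y, sigma2):
    """Log marginal likelihood of a zero-mean GP observed on a full grid,
    assuming the covariance factorises as K = K_1 ⊗ ... ⊗ K_d plus
    iid Gaussian noise with variance sigma2. K itself is never formed.
    """
    eigvals, eigvecs = zip(*(np.linalg.eigh(K) for K in Ks))
    # Eigenvalues of the Kronecker product are all products of factor eigenvalues
    lam = functools.reduce(np.multiply.outer, eigvals).reshape(-1)
    # Rotate y into the eigenbasis, i.e. compute (Q_1 ⊗ ... ⊗ Q_d)^T y,
    # applying one small Q_j^T per grid dimension
    z = y.reshape([K.shape[0] for K in Ks])
    for j, Q in enumerate(eigvecs):
        z = np.moveaxis(np.tensordot(Q.T, z, axes=([1], [j])), 0, j)
    alpha = z.reshape(-1)
    quad = np.sum(alpha**2 / (lam + sigma2))
    logdet = np.sum(np.log(lam + sigma2))
    return -0.5 * (quad + logdet + y.size * np.log(2 * np.pi))
```

The point is that only the small per-dimension factors are eigendecomposed, so the cost is driven by the per-dimension sizes rather than by the full grid size.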


I’ve only thought about someone else implementing them. Pity they didn’t implement these directly in Stan.
More seriously: having these would be great, but it would be a big project to implement in Stan (bigger than writing demo code demonstrating that the methods work). There are plans to add simpler speed-ups first.

There are, however, limitations, which they also mention at some point in the papers (it would be so much nicer to mention the limitations on the first page):

1) Kronecker structure places strict restrictions on the covariance functions that can be used, making the approach less generic.
2) The number of latent values grows exponentially with the number of dimensions, and they say it's unlikely to work beyond 5 dimensions; if I remember correctly they have demonstrated it only in 2D (or maybe 3D in another paper). For higher dimensions this needs some dimension reduction (see the quick calculation after this list).
3) They don't do full Bayesian inference, and in the case of non-Gaussian likelihoods they use a Laplace approximation to integrate over the latent values. For Stan, a billion latent values would be a problem.
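
To give a back-of-the-envelope feel for point 2 (my own numbers, not from the papers): with m grid/inducing points per dimension, the grid has

$$
m_{\text{grid}} \;=\; \prod_{j=1}^{d} m_j \;=\; m^d,
\qquad
m = 100,\ d = 5 \;\Rightarrow\; m_{\text{grid}} = 100^5 = 10^{10},
$$

which is why the approach is practical only in low dimensions or after dimension reduction.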

In addition to these basics (which Stan will hopefully have at some point), it's a bit more complicated to make it fast in Stan, but it would definitely require these building blocks first (and they would be helpful in other cases, too).


What speedups are you referring to here?

The second paper discusses your point #2 here: it uses a form of dimensionality reduction to deal with this issue.

Note that since this is a community project, there is no specific road map, but based on public pull requests, issues, and Discourse forum discussions, my guess is that before KISS-GP we'll see some of these: GPUs, MPI, more internal covariance functions, Kronecker, MRFs, basis functions for 1D, NNGP (unless someone volunteers to implement KISS-GP).

Yes, that’s why I mentioned it, too, as it will limit its use as a generic approach (the limitations due to dimensionality reduction may not be stated so clearly in the paper). Of course, there are also many cases where the data lie in such a lower-dimensional manifold that it is a relatively easy and sensible thing to do.