Gaussian process roadmap

Hey all -

Here are some wikis in different places related to Stan’s GP library, as well as things I’ve done or come across. I’m hoping we can unify these documents and come up with a concrete plan:

1.1 @rtrangucci’s Roadmap for structured linear algebra and GP covariance functions
1.2 @betanalpha’s Adding a GP Covariance Function

  1. We have the following kernels that have PRs open or are already merged into dev:
    gp_exp_quad_cov: prim/rev
    gp_dot_prod_cov: prim
    gp_periodic_cov: prim/rev
    gp_matern52_cov: prim
    gp_matern32_cov: prim
    I also have gp_exponential_cov on my fork with prim and rev, but it needs edits before I open a PR, and I’ve been waiting for other PRs to go through, as this has taken much longer than expected.

  2. We need to edit the documentation in the Stan manual, because some of the example models are incorrect; see this issue: Update GP models in Stan Reference Manual. This isn’t very labor intensive, and I’m happy to edit some of them if I can find the RMarkdown document (this will not go through without peer review, of course).

  3. Having a GP function: for large kernels and operations on them (summing or element-wise multiplication), the memory consumption of the autodiff stack is too large (see the sketch after this list). I’ve had the idea of using shared memory or some memory mapping that would share the autodiff stack among different cores.
    However, this might not be necessary (or at least not worth pursuing immediately) for a couple of reasons:
    a. It looks like @bbbales2’s Adj_jac_apply could in part solve the memory consumption issue. Ben - I see in bullet (1.) you mention that “we do not need to compute the jacobian of your operator, just a vector^T jacobian product”… I’ve gone through the code and your comments in adj_jac_apply.hpp, but would you mind elaborating a bit, perhaps on how this could reduce memory consumption in the context of GPs in Stan? I’m sorry - sometimes I don’t understand things on the first pass.
    b. Having sparse matrices could also reduce memory consumption.

  4. Algebra functions for sparse matrices. See @anon75146577’s document above, in the section on “Sparse Matrix” support. There is a fair amount of overlap with Rob’s 1.1 wiki.
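
To make the memory point in 3. concrete, here’s the kind of Stan program fragment I have in mind (a sketch with made-up names N, x, alpha1, rho1, alpha2, rho2, sigma, using cov_exp_quad twice as a stand-in for whatever kernels end up exposed in the language). Every intermediate N x N matrix of parameters here puts on the order of N^2 varis on the autodiff stack:

transformed parameters {
  matrix[N, N] K = cov_exp_quad(x, alpha1, rho1)
                   .* cov_exp_quad(x, alpha2, rho2)     // elementwise product: another ~N^2 varis
                   + diag_matrix(rep_vector(square(sigma), N));
  matrix[N, N] L_K = cholesky_decompose(K);
}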

It’s important to think about workflow and efficiency. In case more than one person wants to work on this, we can divide the above tasks into disjoint sets so that we can work in parallel. All of the bullet points below can be developed independently:

  1. Finish the above set of kernels in prim and rev.
  2. Implement some of the structured linear algebra types as in Rob’s 1.1, leaving out the specialized Toeplitz output for kernels until the above set of kernels is done in ‘rev’.
  3. Implement a GP function (maybe after @bbbales2 updates adj_jac_apply for matrix datatypes; this probably has to be a one-man job).
  4. Sparse matrix types and operations, following the requirements in the “Sparse Matrix Wishlist” (there are notes from @Bob_Carpenter in the section entitled “A Sparse Matrix Type” that we would need to follow, i.e. something about a map from a sparse matrix to a vector).
  5. Implement some of the NCPs of the kernels as in 1.2 above. (But instead of using the NCP, can’t we just sum a dot product kernel with another kernel - isn’t this the same as including a linear regression part in the mean function, something like that? Or can someone explain to me the importance of developing an NCP independently? See the sketch after this list.)
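
On 5., this is the comparison I have in mind (a sketch; it assumes the dot product kernel ends up exposed in the Stan language as gp_dot_prod_cov, with x_vec holding the inputs as vectors):

transformed parameters {
  matrix[N, N] K = cov_exp_quad(x, alpha, rho)
                   + gp_dot_prod_cov(x_vec, sigma0);  // smooth component + dot product component
}

Summing in the dot product kernel is equivalent to adding a linear regression component whose coefficients have normal priors and are integrated out; that’s closely related to, though not literally the same as, putting a fixed linear term in the mean function.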

The rngs in 1.2 I’m not so sure about. A great feature of GPs is that we can sum/multiply kernels, and for every combination of sums and multiplications we’d need a specialized gp_pred_rng. That would create a lot of work that might only be used in specific cases, and we could probably generate the predictive posterior for a given GP kernel combination automatically if we develop a Stan GP function. We could also specialize the matrix computation (i.e. if it’s sparse or a Toeplitz type) within that function, once we develop more of the structured linear algebra functions. So I don’t think making gp_pred_rngs for each kernel would be a good use of time.

Anyway, I think the priority is finishing up the initial set of kernels, the sparse matrix type, and making sure we don’t have memory issues with large kernels.

thoughts??

This does help! Some small comments now. I’ll make some more later but I wanna do some other coding.

Haha, it is the Way of The Pull Request around here. Good practice for the future, hopefully.

One thing that might be misleading about adj_jac_apply – I keep saying it only creates one vari on the autodiff stack. For N outputs, I mean it creates one vari on the chaining autodiff stack and N - 1 on the non-chaining stack. So in itself it isn’t any more efficient with memory on the outputs.

Where it could save memory is in avoiding all the temporary varis that might be created on the way.

Like:

x - y + z

Will create an intermediate vari for x - y before it creates the one for x - y + z. If you do the autodiff manually, you only have varis at the outputs.

Where it counts is that you can avoid working with the full Jacobian of the function, which is (number of outputs) x (number of inputs). Normal autodiff already avoids the full Jacobian; it’s just that implementing the Jacobian is usually the simplest way to think about doing custom autodiff.
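
In symbols, just to restate that (standard reverse-mode math, not specific to any kernel): for f : R^N -> R^M with Jacobian J in R^{M x N}, reverse mode only ever needs the adjoint-vector product

\bar{x}^T = \bar{y}^T J,

which can usually be formed directly in O(N + M) memory instead of the O(M N) it takes to store J. For example, for y = A x the product is just \bar{x} = A^T \bar{y}, and J = A never has to be built explicitly.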

So it’ll be important but is already happening in the prim stuff kinda.

Cool!

I’ve found it’s super annoying to do posterior predictives with GPs. But I haven’t spent much time thinking about what it might look like.

The basic drift is that it makes it easier to sample the hyperparameters with small amounts of data. The best way to be convinced of this is to see it get rid of divergences in a model.
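
For reference, the non-centered version looks roughly like the manual’s latent-variable GP (a minimal sketch; data block with N, x, y omitted). The centered version would instead declare f as a parameter and give it a multi_normal prior with covariance K:

parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  real<lower=0> sigma;
  vector[N] eta;
}
transformed parameters {
  vector[N] f;
  {
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);
    for (n in 1:N)
      K[n, n] = K[n, n] + 1e-9;       // jitter for numerical stability
    f = cholesky_decompose(K) * eta;  // non-centered: f is a deterministic transform of eta
  }
}
model {
  rho ~ inv_gamma(5, 5);
  alpha ~ normal(0, 1);
  sigma ~ normal(0, 1);
  eta ~ normal(0, 1);
  y ~ normal(f, sigma);
}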

There’s a whole ton of stuff here, so we probably want to separate the long-term/short-term stuff a little and break it into pieces.

For breaking things into pieces, I’d probably do it in terms of places where I can answer the question “how do I know I’m done?”.

@rtrangucci, @avehtari, @anon75146577, if any of you have comments to share on this that would be good. You don’t gotta say anything either – I’m just sending beeps and boops

Actually, I see quite a few people using GPs with Stan through brms these days. @paul.buerkner, do you have any opinions on what @anon79882417 is doing with the GPs (this post: Gaussian process roadmap)? Anything stand out to you?

I do agree with @anon79882417 that adding new kernels and optimizing their implementation (in terms of memory and speed) is of primary interest. This applies in particular to GPs run via brms, because all the Stan language functions that make GPs easier to use are irrelevant when Stan is just called via brms (unless these functions come with efficiency improvements, of course).

As soon as these kernels are available in rstan, I can make a few tweaks so that they can be used through brms as well.

Every posterior predictive mean function and covariance function for the latent f_* can be obtained from the following equations (for non-noisy predictions):

mean function: K_{(X_*,X)} K_{(X,X)}^{-1} f
covariance function: K_{(X_*,X_*)} - K_{(X_*,X)} K_{(X,X)}^{-1} K_{(X,X_*)}

where X_* is the test set, X is the training set, and K_{(X,X_*)} is the covariance matrix between training and test inputs.
In gp_pred, you can take the code in the manual and replace cov_exp_quad with the kernel you’re using (see the sketch below). I hope that clarifies things. But this brings me to a point: having pred_rngs for each covariance function is not really useful, because then we can’t use them when we’re summing kernels. For similar reasons (i.e. including the dot product as the mean function), I don’t see why we should include NCP versions of the kernels. Too specialized, won’t get used enough. The labor/usefulness tradeoff isn’t very good.
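
For concreteness, here’s a sketch adapted from the manual’s gp_pred_rng: the three cov_exp_quad calls are the only kernel-specific lines, so you can swap in (or sum) whatever kernels the model uses without needing a specialized rng per kernel (non-noisy latent predictions; delta is a small jitter):

functions {
  vector gp_pred_rng(real[] x2, vector f1, real[] x1,
                     real alpha, real rho, real delta) {
    int N1 = rows(f1);
    int N2 = size(x2);
    vector[N2] f2;
    {
      matrix[N1, N1] K = cov_exp_quad(x1, alpha, rho);          // <- swap in any kernel or sum of kernels
      matrix[N1, N1] L_K;
      vector[N1] K_div_f1;
      matrix[N1, N2] k_x1_x2;
      vector[N2] f2_mu;
      matrix[N1, N2] v_pred;
      matrix[N2, N2] cov_f2;
      for (n in 1:N1)
        K[n, n] = K[n, n] + delta;
      L_K = cholesky_decompose(K);
      K_div_f1 = mdivide_left_tri_low(L_K, f1);
      K_div_f1 = mdivide_right_tri_low(K_div_f1', L_K)';
      k_x1_x2 = cov_exp_quad(x1, x2, alpha, rho);               // <- same kernel, train vs. test inputs
      f2_mu = k_x1_x2' * K_div_f1;
      v_pred = mdivide_left_tri_low(L_K, k_x1_x2);
      cov_f2 = cov_exp_quad(x2, alpha, rho) - v_pred' * v_pred; // <- same kernel, test inputs
      f2 = multi_normal_rng(f2_mu, cov_f2 + diag_matrix(rep_vector(delta, N2)));
    }
    return f2;
  }
}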

~~
With that said, after a conversation w/ @avehtari, here’s a focus on the stuff that’s more relevant:

We should finish implementing the specialized kernels, i.e. the special cases of the Matérn kernel.

It’s probably most important to first develop a functor type for GPs in Stan that addresses the memory issues (instead of using autodiff to create an expression tree for each element in the matrix, we should use a matrix exponential). And this function can also take care of the predictive rngs.

The Toeplitz and basis-function stuff is for 1D time series with evenly spaced time points and for 1 or 2D GPs, so this will not have high impact and therefore has lower priority.

We want something with more flexibility. So in summary: finish specializing the kernels and then develop a functor.

My personal opinion about use of GPs in Stan

  • Small to moderate size data (where uncertainty quantification is important).
  • Non-linear models with implicit interactions (See, e.g. leukemia example in BDA3. These are difficult or infeasible to do with splines etc.).
  • Hierarchical non-linear models and non-linear latent functions as modular part of bigger models (these limit which speed-up approximations can be used)
  • GAMs with GPs (easier to set priors than for splines)
  • Flexibility by allowing user defined covariance functions written as Stan functions (functor approach for building covariance matrix)
  • Laplace method for integrating over the latent values (this will make inference much faster, but it’s applicable only for restricted set of models)

Especially

  • We shouldn’t compete with specialized GP software that scales to bigger data (with less worry about uncertainty in covariance function parameters) but has restrictions on model structure. That means we are not in a hurry to implement some complicated speed-up approximations which can be used only for very restricted models.

  • For many spatial and spatio-temporal data it’s better to use Markov models with sparse precision matrices as discussed in “Sparse Matrices for Stan” document (type of models commonly used with INLA software)

  • Wishlist for dense matrices from “Sparse Matrices for Stan” document lists some things which are already in progress

    • Parallel (GPU and/or MPI) dense linear algebra
    • General matrix algebra (sum, product, kronecker, matrix_divide, determinant) with reverse-mode derivatives
    • Cholesky and derivative on GPU and general parallel
    • See Stan manual Section 42.2 (Matrix arithmetic operations), 42.13 (Linear Algebra Functions and Solvers)
    • Maybe also 42.4 (Elementwise Functions), 42.5 (Dot Products and Specialized Products)
  • The internal covariance matrix functions @anon79882417 is working on will make some non-linear models with implicit interactions, hierarchical non-linear models, and non-linear latent functions as modular parts of bigger models faster.

  • The Laplace method Charles is working on will make inference for the combination of a GP prior on a latent function and a log-concave likelihood faster.

  • A basis function representation of GPs that Michael and Gabriel are working on will make GAMs with GPs, and 1D GPs as parts of bigger models, faster.

  • The covariance matrix approach (cov_exp_quad etc.) is very limited in flexibility and also turns out to use a lot of memory if many covariance matrices are combined. That’s why the wishlist has “Functor for GP specification”. “Sparse Matrices for Stan” lists:

    • Example would be f ~ GP(function cov_function, tuple params_tuple, vector mean, matrix locations ) for the centred parameterisation.
    • For the non-centred parameterisation, you would need the back-transform f = GP_non_centered_transform(vector non_centered_variable, function cov_function, tuple params_tuple, vector mean, matrix locations ).
    • A GP_predict() function that takes the appropriate arguments and produces predictions.
    • Implementation would populate the matrix, do the Cholesky, and compute the appropriate quantities
    • See also Question about autodiff for potential GP covfun implementation

Note that my wishlist doesn’t yet have speedup approximations such as inducing point approaches, as they are quite complicated to implement and use.

This I don’t agree on but yes to everything else.

Do you disagree “GAMs with GPs” or just “easier to set priors than for splines”?

The bit on priors. Splines should have linear computational cost and only one tuning parameter. GPs scale cubically and have two.

It sounds like for the short term the GP development is all about efficiently doing the basic GP stuff.

@anon79882417, if you want to try to figure out how much memory things are using in a more granular way than just checking memory usage of the process, you can check the number of varis being allocated on the autodiff stack.

There are two big types in the autodiff system, vars and varis. Vars get shoveled around everywhere, and they can take up lots of space, but varis are the things that matter. @Bob_Carpenter calls it a pointer to implementation design pattern, but every var points to a certain vari (multiple var variables could point at the same vari) and it’s the varis that hold the values and adjoints that matter for autodiff.

At the end of evaluating a log probability, the only vars that might actually be left sitting around are the lp var and the ones on your parameters. Everything in the middle gets tossed, and it all comes down to the varis that were created in the process.

There are three types of varis, and they’re stored here: https://github.com/stan-dev/math/blob/develop/stan/math/rev/core/autodiffstackstorage.hpp#L49

You can get information about the number of varis sitting around with code like:

#include <stan/math.hpp>
#include <iostream>

int main() {
  // MatrixSinFunctor here is a user-defined adj_jac_apply functor (definition not shown)
  Eigen::Matrix<stan::math::var, Eigen::Dynamic, Eigen::Dynamic> x(2, 2), y(2, 2);
  x << 2.0, 1.0, 0.0, -1.0;

  // count varis before and after the adj_jac_apply call
  std::cout << "Chaining stack: " << stan::math::ChainableStack::instance().var_stack_.size() << std::endl;
  std::cout << "Non-chaining stack: " << stan::math::ChainableStack::instance().var_nochain_stack_.size() << std::endl;
  y = stan::math::adj_jac_apply<MatrixSinFunctor>(x);
  std::cout << "Chaining stack: " << stan::math::ChainableStack::instance().var_stack_.size() << std::endl;
  std::cout << "Non-chaining stack: " << stan::math::ChainableStack::instance().var_nochain_stack_.size() << std::endl;
}

The output is:

Chaining stack: 4
Non-chaining stack: 0
Chaining stack: 5
Non-chaining stack: 4

This is showing that the high level parameters that we’re going to autodiff each have a vari and then adj_jac_apply itself creates 5 more. One that goes on the chainable autodiff stack and four that go on the non-chaining stack.

The difference in the chaining/non-chaining stack is more to do with performance, but chaining + non-chaining varis should mostly determine the memory usage of your program when using basic autodiff (things implemented in prim only).

What I said is misleading if you consider functions with custom reverse mode autodiff. Those varis are also allowed to allocate memory in the memalloc_ variable here: https://github.com/stan-dev/math/blob/develop/stan/math/rev/core/autodiffstackstorage.hpp#L52

But if you wanted to understand memory usage in different kinds of GP kernels, my suggestion would be to write some code and just check before and after how many varis there are. You can also watch how much stuff gets allocated in memalloc_, but varis get allocated in there too, so it’ll be a little convoluted. That’s gonna tell you a lot.

You should be able to account for all the varis that get created, but it might be a bit finicky. Probably worth the time though if you want to optimize the memory usage.

The stuff Aki is talking about here (Question about autodiff for potential GP covfun implementation) is workable, for sure, but there’s probably value in getting all the basic GPs you listed in your #1 above implemented in prim and rev first?

No worries, mate. In 1D we use GPs with linear computational cost, and we think we are able to do some fancy things with GPs better than with splines, but we’ll report more about that later. That was just a placeholder in the roadmap for now.

So do I. But my experience is that the range parameter has a tendency to be effectively infinite, so a spline with a sum/integrate-to-zero constraint is fine (given that that’s effectively what you’re fitting with a Markov GP and a large range parameter).

Unless you were trying to sound passive aggressive here, probably avoid the Aussie vernacular.

No Markov. Let’s talk more in Helsinki.

Thanks for the info. I’ll keep learning.

Is this something that should be part of the short term GP goals? Linear vs. cubic seems like a big deal.

Is it like the stuff @Bonnevie is talking about here: Using GP for autocorrelation in a time-series model: memory pressure, GP kernel definition - #15 by Bonnevie ?

Oh, I guess this goes back to the small/medium models + flexibility thing. I guess the linear-time GP stuff is kind of specialized.

It is in the roadmap list Gaussian process roadmap - #10 by avehtari

No. We are testing a specific basis function approach.

Yes, but I think a specialized solution is ok if it’s easy to implement as a modular component and there is a popular model family, like GAMs in this case. So we think we have a flexible linear-time solution for 1D GPs, but we’ll first make Stan-language implementation case studies before thinking about C++-level speedups.

I think we need a roadmap at a slightly lower level of granularity to understand the flow of PRs.

I would be more comfortable if we developed one of these covariance functions to completion—that is, we work depth-first on the first one, then when that one’s done to everyone’s satisfaction, model the remaining ones after the first one.

It goes through with PRs just like everything else. The doc sources are in stan-dev repo under src/docs/.

No worries—it took me a few passes to get it into my head what the abstraction was and then several more passes to get it implemented.

This we can do moving forward. It’ll be easier if we get functional types and lambdas into Stan itself.

Same with a lot of the other things and ragged arrays.

It’ll probably be at least a year until we get either of those—@mitzimorris will be working on them after the refactor gets finished (hopefully soon).

Thanks for laying out how to probe memory. This calculates the amount of memory used in our arena, which is the relevant location.

A var is a simple pointer to implementation. Because there are no virtual methods, it takes up exactly one pointer (8 bytes everywhere we run). Also, we only use var on the stack—they’re never allocated on the heap.

The varis, on the other hand, all have a vtable pointer (the virtual chain() method means the class is virtual), a value, and an adjoint, so they are a minimum of 24 bytes and are allocated in our arena.
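
Back-of-the-envelope for the GP case (my arithmetic, using that 24-byte minimum): a dense N x N covariance matrix of parameters means roughly N^2 varis, so

24 bytes * N^2 ≈ 24 * 5000^2 bytes ≈ 600 MB

for a single intermediate kernel matrix at N = 5000, before any sums or elementwise products create further intermediates.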

What about gp_exp_quad_cov or gp_periodic_cov? These have prim and rev implementations completed.

The reason we need a GP function is so that we have a function that manages the autodiff memory arena for a given combination of kernels (meaning sums or element-wise products of them).

Thanks for clarifying. I didn’t realize any of them were done yet. I thought there were still basic changes being made to the framework.

Does this mean we can expect one PR per covariance function going forward or are there more low-level pieces that need to go in?

I didn’t follow enough of the details in the original. What is shared_memory? You mean something built on top of MPI? Is there a signature for what the GP function would look like from a Stan user perspective? I find the best way to motivate features is to write out their functionality from a client perspective. Here, that’d be someone in the math lib or writing a Stan model.

We’ll do one PR for the set of Matérn kernels, and then the dot product, so we have this set of kernels, as this will be easily integrated with brms.

So Stan’s autodiff stack is allocated on the heap. And the heap is all RAM, allocated on CPUs. If I have a process that consumes a bunch of memory, it’s up to the application/software to allocate the memory between cores, correct (at least in C++/C)? And is this the purpose of MPI? My understanding of MPI was that it parallelizes log probability evaluations. But can I instead use MPI to share memory for models that consume lots of memory? Anything I’m not understanding?