GP Basis Function Approximations for Product Kernels

odaniel1 · August 18, 2020, 5:41pm

I’ve recently been working through the paper Practical Hilbert space approximate Bayesian Gaussian
processes for probabilistic programming, and the accompanying github repo.

Comparing the implementation of the birthday problem to that presented in BDA3 I noticed that this implementation excludes the weekly quasi-periodic term included in BDA3: which is the product of a square-exponential and periodic kernel.

I understand that this kernel is not directly amenable to the basis function approach as it is non-stationary, but would like to check my understanding of why we cannot combine the BF approximation for the SE kernel with the stochastic resonator approximation to the periodic kernel (described in Appendix B).

My current understanding is that the efficiency gain in both approximations is that both enable us to write the GP in the latent variable form

f = \Phi \Delta^{1/2}\beta

(adopting the notation from the paper, p.6-7) where only \Delta depends on the GP parameters - so in particular for each sampled set of parameters we do a single matrix multiplication.

On the other hand - the usual latent variable implementation of a GP involves repeatedly performing a Cholesky decomposition for each draw of the parameters - which is more computationally expensive than the matrix multiplication.

Is this interpretation of the efficiency boost correct?

Bringing this back to the product of kernels - my understanding is that we cannot identify the spectral decomposition of the product from that of the two kernels - meaning that even if we worked with the approximations we would still be forced to multiply and then perform a Cholesky decomposition, hence losing any efficiency gain?

Hopefully that’s clear - and would appreciate views on whether this is an accurate heuristic of the method.

Thanks!

avehtari · August 22, 2020, 7:49am

Quasi-periodic was not used as it could require quite many basis functions (if the range is the whole time series) or the same basis functions but different coefficients for different weeks and time series model for those coefficients. With the latter approach the number of coeffiecients would the number of weeks times times the number of basis functions. Of course you use basis function approximation to model the the time series of of the coefficients, but it’s more complicated to make a multivariate time series model.

Non-stationarity is not a problem for basis functions. A linear model is non-stationary and also a perfect basis function presentation of GP with dot-product covariance function.

I’m not aware of how we could do (general) quasi-periodic time series GP model case efficiently with basis functions. I do now it’s possible to present them as state-space-models to get linear scaling with data Explicit Link Between Periodic Covariance Functions and State Space Models, but the state-space-models need also efficient implementations and especially they would require gradient computations written also in C++ as just autodiffing through Kalman filter is very slow. I’m not aware of such implementation at this time, but it’s something that would be nice to have.

odaniel1 · August 28, 2020, 7:03am

@avehtari - Thanks for this, its really helpful: previously I hadn’t realised your point around the need for (no. weeks) x (no. BF) variables which I can see would become quite impractical.

With stationarity - my understanding when reading the paper was that our ability to construct the approximation is conditional on being able to use Bochner’s Theorem, for which stationarity is a neccessary condition.

I think I will need to revisit the paper to properly understand this.

Thanks again for your thoughtful comments.

avehtari · August 28, 2020, 8:31am

Sorry for the confusion. The method for generating basis functions in the paper assumes stationarity, but there are many ways to generate basis functions and I commented that in general it is possible to have basis functions that are non-stationary.

Topic		Replies	Views
Approximate GPs with Spectral Stuff Modeling gaussian-process	57	9469	August 5, 2020
How to make GP with non-Gaussian likelihood more efficient? Modeling gaussian-process	21	2260	February 25, 2020
Efficient Gaussian Process \| The Birthday problem Publicity	5	1216	August 13, 2021
Sparse approximation for gaussian process Modeling	1	1120	July 24, 2020
Periodic Gaussian Process - covariance matrix not positive semidefinite General	7	4643	October 24, 2020

GP Basis Function Approximations for Product Kernels

Related topics