Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming

We have a new paper Gabriel Riutort-Mayol, Paul-Christian Bürkner, Michael R. Andersen, Arno Solin, Aki Vehtari (2020). Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming. https://arxiv.org/abs/2004.11408 with code in Stan and brms

The approximation itself is not new as it is based on the Hilbert space basis function approach by Solin and Särkkä https://doi.org/10.1007/s11222-019-09886-w

The present paper adds

  • analysis of the performance and diagnostics in relation to the number of basis functions, domain of the prediction space, and smoothness of the latent function
  • practical implementation details for Stan

In addition of simulations, we re-create two models from BDA3 (which were made with GPstuff):

  • birthday model with 7305 observations in one dimension
  • leukemia survival model with 1043 observations in four dimensions with full interactions

In addition of having code for Stan at https://github.com/gabriuma/basis_functions_approach_to_GP, this approximation has been for some time in brms https://rdrr.io/cran/brms/man/gp.html, but we haven’t been advertising it much before we got the paper ready.

When GPs are implemented using covariance matrices, the computation cost is O(n^3) for every log density evaluation. This reduced rank approximation has pre-computation cost O(m^2n) for m basis functions, but after that the computational cost for log density evaluations is O(mn + m). The number of basis functions needed depends on the wigglines of the target and the number of dimensions. The scaling in dimensions is exponential restricting the practical use for 1-4 dimensions with full interactions. With additive models the scaling is linear in the number of additive components. The additional benefits of the basis function approximation in Stan is the reduced size of the autodiff tree and that we can use compound glm functions in for additional speedup.

EDIT: added a paragraph on computational cost.

16 Likes