We (JP Onnela and I) have been working on a library to scale up Gaussian process inference in Stan using two methods:

- Sparse approximations using directed acyclic graphs to encode which dependencies to retain. The method is quite flexible, e.g., nearest-neighbor GPs are a special case of this approximation.
- Fourier methods to evaluate the likelihood exactly if the observation points are on a regular grid (in 1D and 2D), e.g, for regularly sampled time series.

Both methods are available as “centered” (evaluating or approximating the likelihood of a realization directly) and “non-centered” (transforming white noise to a realization) parameterizations for performant inference in “strong” and “weak” data regimes, respectively.

Extensive documentation is available at https://gptools-stan.readthedocs.io, the source code can be found at GitHub - onnela-lab/gptools: Gaussian processes on graphs and lattices in Stan and pytorch., and an accompanying preprint with details and examples is available at [2301.08836] Scalable Gaussian Process Inference with Stan.

We’d love to get feedback on the library and preprint from members of the community.

Some highlights (or at least things we think are interesting):

- Typical runtime for 10k observations is ~15 seconds on my laptop using Fourier methods (without inferring the length scale and amplitude of the kernel). But this of course depends on the details of the model.
- We discuss the importance of choosing parameterizations in some depth. For sampling-based inference, this is (primarily) a performance concern. But for variational inference, picking an appropriate parameterization is even more important because it affects the quality of the approximation (see figure 2).
- We provide some guidance for practitioners to choose which method (standard GP inference with a full covariance matrix, sparse approximation, Fourier methods) and parameterization (centered, non-centered) to choose (see figure 5).
- Because GPs with squared-exponential kernel are
*very*smooth, their Fourier coefficients at high frequencies are small. By using non-centered Fourier methods and discarding high-frequency terms, we can often reduce the number of parameters substantially without significantly affecting the GP (see figure 1). Effectively just a low-pass filter on the process. - The library is pip-installable as
`pip install gptools-stan`

for use with`cmdstanpy`

. We’re keen to also publish a version to CRAN for use with`cmdstanr`

but my R-skills are ≈ 0. - The library implements real FFTs in one and two dimensions to remove redundant complex conjugate Fourier coefficient pairs (although using the full FFT under the hood rather than a more performant implementation).
- A
`pytorch`

version is also available for use with`pyro`

in the same repository. - As part of developing the documentation, we also published sphinx-stan — sphinx-stan documentation, a sphinx extension to automatically document Stan code. It supports inline function definitions, loading documentation from
`.stan`

files, and cross referencing, including the handling of overloaded functions.

Thank you to @WardBrian for help with Fourier transforms, include directives, and general advice on `cmdstanpy`

.