I found an error in Algorithm 5.5 of my thesis, which describes how to make predictive draws with the integrated Laplace approximation. This affects out-of-sample predictions, which I didn’t use in any applications (which might be why I didn’t catch the error, though I should’ve built a better unit test).
Error and fix
When fitting Gaussian processes it is common to distinguish training data, X, from testing data, X^*, and to have latent Gaussian variables associated with each, respectively \theta and \theta^* – see Rasmussen & Williams (2006, Section 3.4.2). After marginalizing out \theta, we may want posterior draws for both \theta and \theta^*. The case of \theta is relatively straightforward.
For \theta^*, we need to examine the “extended” prior covariance matrix,

\bf K = \begin{pmatrix} K(X, X) & K(X, X^*) \\ K(X^*, X) & K(X^*, X^*) \end{pmatrix}.

I’ll denote K = K(X, X), K^* = K(X, X^*) and K^{**} = K(X^*, X^*). We could abstract the problem a little, with

\bf K = \begin{pmatrix} K & K^* \\ (K^*)^T & K^{**} \end{pmatrix},

where the different components of \bf K are arbitrarily specified by the user (i.e. they’re not the result of one fixed covariance function applied to different X’s).
As written in the thesis, the mean of the approximating normal is

\mathbb E(\theta^* \mid y) = (K^*)^T K^{-1} \hat\theta,

where \hat\theta is the mode found by the Laplace approximation. The error creeps into the calculation of the covariance, which is

\text{Cov}(\theta^* \mid y) = K^{**} - (K^*)^T (K + W^{-1})^{-1} K^*,

where W is the negative Hessian of the log likelihood evaluated at \hat\theta. I erroneously wrote the first term on the RHS as K^* and carried the mistake over into my code :( Fortunately, this is an easy fix.
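To make the fix concrete, here’s a minimal Eigen sketch of the corrected predictive moments. This is only an illustration, not the Stan Math implementation: predictive_moments is a hypothetical name, the mode theta_hat and the negative Hessian W are assumed to be already computed, and I skip the W^{1/2} parameterization Rasmussen & Williams recommend for numerical stability.

#include <Eigen/Dense>
#include <utility>

using Eigen::MatrixXd;
using Eigen::VectorXd;

// Corrected predictive moments (hypothetical helper, for illustration):
//   mean:       E(theta* | y)   = (K*)^T K^{-1} theta_hat
//   covariance: Cov(theta* | y) = K** - (K*)^T (K + W^{-1})^{-1} K*
std::pair<VectorXd, MatrixXd> predictive_moments(
    const MatrixXd& K, const MatrixXd& K_star, const MatrixXd& K_star_star,
    const VectorXd& theta_hat, const MatrixXd& W) {
  VectorXd mean = K_star.transpose() * K.llt().solve(theta_hat);
  MatrixXd middle = K + W.inverse();
  // The first term must be K**, not K* -- this is the thesis error.
  MatrixXd cov = K_star_star
      - K_star.transpose() * middle.llt().solve(K_star);
  return {mean, cov};
}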
API
To draw posterior samples for \theta^*, we need to compute K, K^* and K^{**}. That’s three potentially different covariance matrices. The current approach limits users to a single covariance function, using variadic arguments, but allows them to change the arguments that get passed when computing K and K^*. A call to the function may look as follows:
vector[n_obs] theta
  = laplace_marginal_rng(y, n_samples, ye, theta_0,
                         K_functor,
                         forward_as_tuple(X, X),
                         forward_as_tuple(X, X_pred),
                         alpha, rho);
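Under the hood, each tuple needs to be unpacked and concatenated with the shared trailing arguments before calling K_functor. Here’s a minimal C++17 sketch of one way to do that dispatch; apply_covariance is a hypothetical helper, not actual Stan Math code.

#include <tuple>
#include <utility>

// Invoke the covariance functor with the arguments stored in the tuple t,
// followed by the shared trailing arguments (e.g. alpha, rho).
template <typename F, typename Tuple, typename... Args>
auto apply_covariance(F&& f, Tuple&& t, Args&&... args) {
  return std::apply(
      [&](auto&&... tuple_args) {
        return std::forward<F>(f)(
            std::forward<decltype(tuple_args)>(tuple_args)...,
            std::forward<Args>(args)...);
      },
      std::forward<Tuple>(t));
}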
The idea is that K is computed using

K = K_functor(X, X, alpha, rho);

and K^* using

K_p = K_functor(X, X_pred, alpha, rho);

with the trailing variadic arguments, alpha and rho, used in both calls. Applying the fix to our algorithm, we also need to specify how to compute K^{**}. So we need a third set of arguments, and the call becomes
vector[n_obs] theta
  = laplace_marginal_rng(...
                         K_functor,
                         forward_as_tuple(X, X),
                         forward_as_tuple(X, X_pred),
                         forward_as_tuple(X_pred, X_pred),
                         alpha, rho);
Hmmm… ok. The only annoying thing about this is that if we want posterior draws for \theta, we need to pass forward_as_tuple(X, X)
three times. What’s the fix here? Should we overload the function? Create two rng functions?
What if we need a more flexible solution?
Right now, the only way to distinguish K, K^*, and K^{**} is to pass different arguments to the same covariance function. This might not be flexible enough. We could create a function that lets the user pass three distinct covariance functors, but then we lose variadic arguments.
The one solution that comes to mind is to treat \theta^* as regular latent Gaussian variables with empty groups. The covariance functor then returns the whole covariance matrix \bf K (and the likelihood is rewritten accordingly). That gives us all the flexibility we want, but it increases the dimension of the problem from n to n + n^* – with \theta \in \mathbb R^n and \theta^* \in \mathbb R^{n^*} – and since the complexity of the approximation is cubic in the dimension, the cost grows from \mathcal O(n^3) to \mathcal O((n + n^*)^3).
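As a sketch of what this might look like, here’s a covariance functor that builds the whole extended matrix in one shot. The squared-exponential kernel and all names are hypothetical stand-ins, assuming Eigen.

#include <Eigen/Dense>
#include <cmath>

using Eigen::MatrixXd;

// Example kernel (hypothetical): squared-exponential with magnitude alpha
// and length scale rho; rows of x1 and x2 are observations.
MatrixXd se_cov(const MatrixXd& x1, const MatrixXd& x2,
                double alpha, double rho) {
  MatrixXd K(x1.rows(), x2.rows());
  for (int i = 0; i < x1.rows(); ++i)
    for (int j = 0; j < x2.rows(); ++j)
      K(i, j) = alpha * alpha
          * std::exp(-0.5 * (x1.row(i) - x2.row(j)).squaredNorm()
                     / (rho * rho));
  return K;
}

// Functor returning the whole (n + n*) x (n + n*) covariance matrix,
// with theta* treated as latent variables attached to empty groups.
MatrixXd K_full(const MatrixXd& X, const MatrixXd& X_pred,
                double alpha, double rho) {
  MatrixXd K = se_cov(X, X, alpha, rho);               // K
  MatrixXd K_s = se_cov(X, X_pred, alpha, rho);        // K*
  MatrixXd K_ss = se_cov(X_pred, X_pred, alpha, rho);  // K**
  MatrixXd full(K.rows() + K_ss.rows(), K.cols() + K_ss.cols());
  full << K,               K_s,
          K_s.transpose(), K_ss;
  return full;
}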
Proposition
Have an API for the rng function which allows three different sets of arguments for K_functor, and overload the function for the case where we pass only one set of arguments (i.e. K^{**} = K^* = K).
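In C++ terms, the proposition might look like the signature sketch below. This is just one way to set it up, with hypothetical names; a production implementation would also need to guard against shared trailing arguments that are themselves tuples.

#include <tuple>
#include <utility>

// Primary signature: one tuple of arguments per covariance block.
template <typename KF, typename... T1, typename... T2, typename... T3,
          typename... Args>
void laplace_marginal_rng(KF&& K_functor,
                          const std::tuple<T1...>& k_args,
                          const std::tuple<T2...>& k_star_args,
                          const std::tuple<T3...>& k_star_star_args,
                          Args&&... shared_args) {
  // ... build K, K*, and K** by applying K_functor to each tuple plus
  // shared_args, then generate the posterior draws ...
}

// Overload for a single set of arguments: reuse it for all three blocks,
// so that K** = K* = K and draws for theta don't require repeating
// forward_as_tuple(X, X) three times.
template <typename KF, typename... T, typename... Args>
void laplace_marginal_rng(KF&& K_functor, const std::tuple<T...>& k_args,
                          Args&&... shared_args) {
  laplace_marginal_rng(std::forward<KF>(K_functor), k_args, k_args, k_args,
                       std::forward<Args>(shared_args)...);
}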