Normalizing constant for embedded Laplace approximation

Do we need the normalizing constant of the likelihood when doing an embedded Laplace approximation? It seems that we can drop the constant and make computations a bit more efficient.

Given a model

\phi \sim \pi(\phi)
\eta \sim \pi(\eta)
\theta \mid \phi \sim \text{Normal}(0, K(\phi))
y \mid \theta, \eta \sim \pi(y \mid \theta, \eta)

and introducing \theta^*, the mode of the conditional posterior \pi(\theta \mid y, \phi, \eta), along with the negative Hessian at that mode, W = - \nabla^2_\theta \log \pi(y \mid \theta^*, \eta), the approximate marginal likelihood is

\log \pi_\mathcal{G} (y \mid \phi, \eta) = - \frac{1}{2} \theta^{*T} K^{-1} \theta^* + \log \pi(y \mid \theta^*, \eta) - \frac{1}{2} \log \left( |K| \, |K^{-1} + W| \right).
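
To make the question concrete, here is a minimal numerical sketch of this quantity for a toy Poisson-count model. The plain Newton iteration, the variable names, and the include_norm flag are my own illustration, not the Stan or GPStuff implementation (which use numerically stable factorizations); the only point is where the likelihood's normalizing constant enters.

```python
import numpy as np
from scipy.special import gammaln


def log_lik(y, theta, include_norm=True):
    """Poisson log likelihood with a log link: sum_i [y_i theta_i - exp(theta_i) - log(y_i!)].

    The -log(y_i!) terms are the normalizing constant; they depend on neither
    theta nor (in this toy model) any eta, so dropping them only shifts the
    result by a constant.
    """
    out = np.sum(y * theta - np.exp(theta))
    if include_norm:
        out -= np.sum(gammaln(y + 1))
    return out


def laplace_log_marginal(y, K, include_norm=True, n_newton=50):
    """Naive Laplace approximation of the log marginal likelihood for the toy model above."""
    n = len(y)
    K_inv = np.linalg.inv(K)
    theta = np.zeros(n)
    # Newton iterations for the mode theta* of pi(theta | y, phi).
    for _ in range(n_newton):
        grad = y - np.exp(theta)      # gradient of the Poisson log likelihood
        W = np.diag(np.exp(theta))    # negative Hessian of the log likelihood
        theta = np.linalg.solve(K_inv + W, W @ theta + grad)
    W = np.diag(np.exp(theta))
    # log(|K| |K^-1 + W|) = log|I + K W|
    _, logdet = np.linalg.slogdet(np.eye(n) + K @ W)
    return (-0.5 * theta @ K_inv @ theta
            + log_lik(y, theta, include_norm)
            - 0.5 * logdet)


# Toy data: a squared-exponential covariance with jitter and Poisson counts.
rng = np.random.default_rng(0)
n = 5
x = np.arange(n, dtype=float)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2) + 1e-6 * np.eye(n)
y = rng.poisson(1.0, size=n)
print(laplace_log_marginal(y, K, include_norm=True))
print(laplace_log_marginal(y, K, include_norm=False))
```

Toggling include_norm shifts the result by \sum_i \log(y_i!), a constant that does not depend on K (i.e., on \phi), which is why the term looks droppable when sampling \phi and \eta.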

When drawing approximate samples from the marginal posterior \pi(\phi, \eta \mid y), we don’t need the normalizing constant in \log \pi(y \mid \theta^*, \eta). Our current unit tests benchmark against GPStuff, which as far as I can tell computes the normalizing constant. Is there a reason for this?

Edit: making corrections to the equations, following Aki’s suggestion.

@avehtari


I think \log \pi(y \mid \theta, \eta) should be \log \pi(y \mid \theta^*, \eta)

I think \pi(\phi \mid y) should be \pi(\phi, \eta \mid y)

I think \log \pi(y \mid \theta, \eta) should be \log \pi(y \mid \theta^*, \eta). As \theta^* is fixed, the normalization term with respect to \theta^* is constant and not needed for MCMC. In most cases, though, the computational cost is negligible compared to the matrix and Laplace operations, and it was easier to make comparisons to full MCMC when those terms were not dropped. For models with \eta as a parameter, the normalization term depends on \eta and needs to be included.
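
To make that last distinction concrete (the Poisson and negative binomial here are just illustrative choices, not necessarily the likelihoods in the unit tests): for a Poisson likelihood with a log link,

\log \pi(y_i \mid \theta_i) = y_i \theta_i - e^{\theta_i} - \log(y_i!),

the normalization term -\log(y_i!) depends on neither \theta^* nor \eta, so it only adds a constant. For a negative binomial likelihood with dispersion \eta,

\log \pi(y_i \mid \theta_i, \eta) = \log \Gamma(y_i + \eta) - \log \Gamma(\eta) - \log(y_i!) + \eta \log \eta + y_i \theta_i - (y_i + \eta) \log(\eta + e^{\theta_i}),

the \log \Gamma(y_i + \eta) - \log \Gamma(\eta) + \eta \log \eta part varies with \eta, so it has to stay in \log \pi_\mathcal{G}(y \mid \phi, \eta) when \eta is sampled.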


Ok, let’s plan to keep the normalizing constant for now. I’ll do some checks to examine performance differences.
