Mean function of Gaussian Process in a regression

Hi, this is a general modeling question, not Stan-specific.

Can someone provide an intuitive explanation of how I should think about the mean function of a GP in a regression?

A standard regression: y = a + b*x or y = a + b*ln(x)
A GP regression: y ~ MVN(m(x), K(x, x’)), with a mean function m and a kernel K

If I am putting a GP on x, is it ok to simply assume the mean function to be 0? I often see this done in practice, but I don’t understand why it suffices. If I am used to using ln(x) in the regression, does it mean I should actually assume the mean function to be ln(x)? Thanks in advance.

I think it’s fair to think of the mean function as being a + b*ln(x).

y = a + b*ln(x) + f(x) + error, f ~ MVN(0, K(x, x’))

Should be the same thing as:

y = f(x) + error, f ~ MVN(a + b*ln(x), K(x, x’))

I believe.
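
For what it's worth, here is a minimal NumPy sketch of why the two forms above should agree. It is only an illustration: the squared-exponential kernel, the helper names `rbf_kernel` and `mvn_logpdf`, and the values of a, b and the noise scale are all my own assumptions, not anything from a particular library. With Gaussian noise, both forms imply the same marginal density for y, so the two log densities should match.

```python
import numpy as np

def rbf_kernel(x1, x2, alpha=1.0, rho=1.0):
    """Squared-exponential kernel (an assumed choice): alpha^2 * exp(-(x - x')^2 / (2 rho^2))."""
    d = x1[:, None] - x2[None, :]
    return alpha**2 * np.exp(-0.5 * (d / rho) ** 2)

def mvn_logpdf(y, mean, cov):
    """Log density of a multivariate normal, computed through a Cholesky factor."""
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, y - mean)
    return -0.5 * z @ z - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x = np.linspace(0.5, 5.0, 20)
a, b, sigma = 1.0, 2.0, 0.3              # hypothetical parameter values
trend = a + b * np.log(x)
cov = rbf_kernel(x, x) + sigma**2 * np.eye(len(x))   # K plus noise variance
y = rng.multivariate_normal(trend, cov)  # simulated data, for illustration only

# Form 1: explicit trend plus a zero-mean GP, so (y - trend) ~ MVN(0, K + sigma^2 I)
lp_zero_mean = mvn_logpdf(y - trend, np.zeros_like(x), cov)

# Form 2: GP with mean function a + b*ln(x), so y ~ MVN(trend, K + sigma^2 I)
lp_mean_fn = mvn_logpdf(y, trend, cov)

print(np.isclose(lp_zero_mean, lp_mean_fn))
```

The density only ever sees y minus the mean, which is why moving the trend from the regression equation into the GP mean changes nothing.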

Edit: A zero mean function is common for a few reasons: ‘y’ is often standardized or mean-centered; the most common kernel functions are universal kernels and can approximate the mean function anyway; and the ‘hard’ part is estimating the GP itself, so adding a mean function when describing GPs just complicates matters. That’s not to say the mean function is unimportant, just that when teaching GPs it’s not the novel or ‘hard’ problem, so it can be a bit of a distraction.
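
To make the centering point concrete, here is a rough sketch (again plain NumPy, with an assumed squared-exponential kernel, fixed hyperparameters, and made-up data; `gp_posterior_mean` is a hypothetical helper, not a library function). It compares a zero-mean GP fit to mean-centered y against a GP on raw y with a constant mean equal to the sample mean. The last test point sits far from the data, where the prediction should fall back to the mean function.

```python
import numpy as np

def rbf_kernel(x1, x2, alpha=1.0, rho=1.0):
    """Squared-exponential kernel (an assumed choice for this sketch)."""
    d = x1[:, None] - x2[None, :]
    return alpha**2 * np.exp(-0.5 * (d / rho) ** 2)

def gp_posterior_mean(x, y, x_new, mean_fn, sigma=0.3):
    """Posterior mean m(x*) + K* (K + sigma^2 I)^{-1} (y - m(x)) with fixed hyperparameters."""
    K = rbf_kernel(x, x) + sigma**2 * np.eye(len(x))
    K_star = rbf_kernel(x_new, x)
    return mean_fn(x_new) + K_star @ np.linalg.solve(K, y - mean_fn(x))

rng = np.random.default_rng(1)
x = np.linspace(0.5, 5.0, 30)
y = 1.0 + 2.0 * np.log(x) + rng.normal(0.0, 0.3, size=x.size)  # hypothetical data
x_new = np.array([1.0, 3.0, 50.0])   # the last point is far outside the data

y_bar = y.mean()
zero_mean = lambda t: np.zeros_like(t)
const_mean = lambda t: np.full_like(t, y_bar)

# Zero-mean GP on centered y, with the sample mean added back afterwards
pred_centered = y_bar + gp_posterior_mean(x, y - y_bar, x_new, zero_mean)

# GP on raw y with a constant mean function equal to the sample mean
pred_const = gp_posterior_mean(x, y, x_new, const_mean)

print(np.allclose(pred_centered, pred_const))
# Far from the data the kernel term vanishes and only the mean function is left
print(abs(pred_const[-1] - y_bar) < 1e-6)
```

So centering y and using a zero mean amounts to a constant mean function. The kernel handles the rest near the data, but away from the data the predictions revert to whatever mean you assumed, which is where a choice like a + b*ln(x) could start to matter.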
