diagSPD for Matérn GP kernels other than 3/2

Hi everyone,

I’ve been fitting some Gaussian Process models using the Hilbert space approximation, adapting code from the tutorials by @avehtari here and here. I would like to fit a model which uses a Matérn 1/2 kernel, but I am not sure how to adapt the diagSPD_Matern32 function to the other Matérn kernels - any assistance with this would be greatly appreciated.

Here is the function in question:

vector diagSPD_Matern32(real alpha, real rho, real L, int M) {
  return 2*alpha * (sqrt(3)/rho)^1.5 * inv((sqrt(3)/rho)^2 + ((pi()/2/L) * linspaced_vector(M, 1, M))^2);
}

Thanks!

See discussion here: Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming - #5 by js592


FYI, this was recently implemented in brms (you may need to install from GitHub if it hasn’t made it into a release yet). You could use a formula like

y ~ gp(x, cov="exponential", k=10)

where k is the number of basis functions. Depending on your needs you could either use the brms model directly or use make_stancode() and copy the relevant functions to your own Stan model.


Here’s what we think it is (any review / error spotting welcome, see related PR):


That’s great, thanks!

Do you have a link to the maths vignette?

My implementation looks like:

vector diagSPD_exp(real alpha, real rho, real L, int m) {
    vector[m] lambda = (linspaced_vector(m, 1, m)*pi()/(2*L))^2;
    vector[m] diag = 2*alpha*rho/(1+rho^2*sqrt(lambda)^2);
    diag = diag.^(0.5);
    
    return diag;
}

Testing both with alpha = 1, rho = 0.5, L = 1, m = 8, I got:

[0.847367,0.748398,0.677581,0.623686,0.580895,0.545854,0.516474,0.491379] (yours)
vs
[0.786439,0.537029,0.390683,0.303314,0.246773,0.207584,0.178955,0.157177] (mine)

I’m actively using this code in my work so I want to make sure I didn’t make any mistakes in my derivation :)
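For anyone who wants to reproduce the two sets of numbers outside Stan, here is a small Python sketch (the helper names `lam` and `spd_sqrt` are mine, not from the thread). The only difference between the two variants is what goes into the omega^2 slot of the Matérn 1/2 spectral density: lambda_m (so omega = sqrt(lambda_m)) versus sqrt(lambda_m):

```python
import math

def lam(m, L):
    """Eigenvalues of the Laplacian on [-L, L]: lambda_m = (m*pi/(2L))^2."""
    return (m * math.pi / (2 * L)) ** 2

def spd_sqrt(alpha, rho, L, M, misplace_sqrt=False):
    """alpha * sqrt(s(omega)) for the Matern 1/2 spectral density
    s(omega) = 2*rho / (1 + rho^2 * omega^2), evaluated at omega = sqrt(lambda_m).
    With misplace_sqrt=True, sqrt(lambda_m) is used in place of lambda_m in the
    omega^2 slot, which reproduces the other vector quoted in the thread."""
    out = []
    for m in range(1, M + 1):
        omega_sq = math.sqrt(lam(m, L)) if misplace_sqrt else lam(m, L)
        s = 2 * rho / (1 + rho ** 2 * omega_sq)
        out.append(alpha * math.sqrt(s))
    return out

mine = spd_sqrt(1.0, 0.5, 1.0, 8)               # -> 0.786439, 0.537029, ...
yours = spd_sqrt(1.0, 0.5, 1.0, 8, True)        # -> 0.847367, 0.748398, ...
```

Running this with alpha = 1, rho = 0.5, L = 1, M = 8 reproduces both vectors above to the printed precision, which confirms that the discrepancy is entirely in the handling of the lambda term.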

Edit: I have narrowed down the difference to how the \lambda^2 term is coded. E.g. modifying:

vector diagSPD_Matern12(real alpha, real rho, real L, int M) {
  vector[M] indices = linspaced_vector(M, 1, M);
  real factor = 2;
  vector[M] denom = rho * ((1 / rho)^2 + (pi() / 2 / L)^2 * indices^2);
  return alpha * sqrt(factor * inv(denom));
}

gives identical
[0.786439,0.537029,0.390683,0.303314,0.246773,0.207584,0.178955,0.157177]
(or vice versa, if you change sqrt(lambda)^2 to sqrt(lambda) in the other function)

That being said, I need to dig a bit deeper to see which is correct.


So, there are a couple of things going on here. The first component in the approximation is the spectral density, which I derived as:
s(\omega) =\frac{2 \rho \alpha}{1+\rho^2 \omega^2} = \frac{2\alpha}{ \frac{1}{\rho}+\rho\omega^2} =\frac{2\alpha}{ \rho((\frac{1}{\rho})^2+\omega^2)}.

The second component is this spectral density evaluated at the square root of the eigenvalues:
s(\sqrt{\lambda_m}), where \lambda_m = (\frac{m \pi}{2L})^2.
So, I think the following implementation is correct:
s(\sqrt{\lambda_m}) = \frac{2\alpha}{ \rho((\frac{1}{\rho})^2+(\frac{m \pi}{2L})^2)}

Then, this gets an additional square root when we are using the non-centered linear representation.
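As a quick numerical check of this derivation (assuming alpha = 1, rho = 0.5, L = 1, as in the test earlier in the thread), the following Python sketch verifies that the three algebraic forms of s(\omega) agree and that the final square root reproduces the first quoted value:

```python
import math

# Parameters from the test in the thread; m = 1 (first basis function).
alpha, rho, L, m = 1.0, 0.5, 1.0, 1

lam_m = (m * math.pi / (2 * L)) ** 2   # eigenvalue lambda_m
omega = math.sqrt(lam_m)               # spectral density is evaluated at sqrt(lambda_m)

# Three algebraically equivalent forms of the Matern 1/2 spectral density.
s1 = 2 * rho * alpha / (1 + rho ** 2 * omega ** 2)
s2 = 2 * alpha / (1 / rho + rho * omega ** 2)
s3 = 2 * alpha / (rho * ((1 / rho) ** 2 + omega ** 2))
assert abs(s1 - s2) < 1e-12 and abs(s2 - s3) < 1e-12

# Extra square root for the non-centred linear representation.
diag_m = math.sqrt(s1)   # approximately 0.786439
```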


Yes, I think you’re right (and your maths matches our derivation in Gaussian Process implementation details • EpiNow2) - thanks!


Happy to help.

I tested the Matérn 5/2 kernel in one of my programs and it had more fitting difficulties than the Matérn 3/2 kernel (longer runtime). The results, however, are plausible.

Further, I suggest writing the code slightly more homogeneously:

vector diagSPD_Matern12(real alpha, real rho, real L, int M) {
...
  return alpha * sqrt(factor * inv(denom));
}

vector diagSPD_Matern32(real alpha, real rho, real L, int M) {
...
  return factor * inv(denom);
}

vector diagSPD_Matern52(real alpha, real rho, real L, int M) {
...
  return alpha * sqrt(factor * inv(denom));
}

I just got confused because we have alpha * sqrt(factor * inv(denom)) twice and plain factor * inv(denom) once.
Nevertheless, I wanted to check the math but couldn’t find the reference. Do you have a link?
Thank you for the work, btw!
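For checking the maths across all three kernels, the general 1-D Matérn spectral density (see e.g. Rasmussen & Williams, Gaussian Processes for Machine Learning, §4.2, in the angular-frequency convention) covers nu = 1/2, 3/2 and 5/2 uniformly. A Python sketch (the function name sqrt_spd_matern is mine) that evaluates alpha * sqrt(S(omega)) for any nu:

```python
import math

def sqrt_spd_matern(nu, alpha, rho, omega):
    """alpha * sqrt(S(omega)) for the 1-D Matern spectral density
    S(omega) = c * (2*nu/rho^2 + omega^2)^-(nu + 1/2), where
    c = 2*sqrt(pi) * Gamma(nu + 1/2) * (2*nu)^nu / (Gamma(nu) * rho^(2*nu))."""
    c = (2 * math.sqrt(math.pi) * math.gamma(nu + 0.5) * (2 * nu) ** nu
         / (math.gamma(nu) * rho ** (2 * nu)))
    return alpha * math.sqrt(c * (2 * nu / rho ** 2 + omega ** 2) ** -(nu + 0.5))

# Evaluate at omega = sqrt(lambda_m) = m*pi/(2L), with the thread's test values.
alpha, rho, L, M = 1.0, 0.5, 1.0, 8
omegas = [m * math.pi / (2 * L) for m in range(1, M + 1)]
d12 = [sqrt_spd_matern(0.5, alpha, rho, w) for w in omegas]  # first value ~ 0.786439
d32 = [sqrt_spd_matern(1.5, alpha, rho, w) for w in omegas]
d52 = [sqrt_spd_matern(2.5, alpha, rho, w) for w in omegas]
```

For nu = 1/2 this reduces to 2*rho/(1 + rho^2*omega^2) and reproduces the values above, and for nu = 3/2 its square root matches the 2*alpha*(sqrt(3)/rho)^1.5 * inv(...) expression in the original diagSPD_Matern32, which may help as a cross-check of the three Stan functions.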

This one? [2004.11408] Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming