Periodic Gaussian Process - covariance matrix not positive semidefinite

Hi all,
I am trying to model a periodic function as a Gaussian process. The modelled function has not other constraints than the need to be periodic and smooth. My initial idea was to use squared exponential kernel, and only redefine the distance metric, so that it treats the time variable as a circle. However, for certain combinations of parameters this fails to yield a positive semi-definite matrix. Any idea what might be wrong? Or a pointer to some literature on GPs using different distance metrics than Euclidean distance?

Thanks a lot!

Here is R code that shows the problem (I encountered this why preparing simulated data for a Stan model in R)

generate_periodic_gp <- function(time, scale, length) {
  # Construct the squared exponential covariance matrix
  cov_matrix = array(0,c(length(time), length(time)));
  maxTime = max(time)
  minTime = min(time)
  for(i in 1:length(time))  {
    cov_matrix[i,i] = scale^2 + 0.00000001; #Adding jitter to ensure positive semi-definiteness
    if (i < length(time)) {
      for(j in (i+1):length(time)) {
        #Take either the normal distance or the distance wrapped around the edge, whichever is shorter
        distance = min(abs(time[i] - time[j]), min(time[i], time[j]) - minTime + maxTime - max(time[i], time[j]))

        covariance = (scale^2) * exp( (-0.5 / (length ^ 2)) * (distance ^ 2) );
        cov_matrix[i,j] = covariance
        cov_matrix[j,i] = covariance
      }
    }
  }
  # Draw from the multinormal distribution using the cholesky decomposition of the cov. matrix
  chol_cov = t(chol(cov_matrix));
  values = chol_cov %*% rnorm(length(time));
  return(values)
}

times = seq(1,10, by =0.1)
values = generate_periodic_gp(times,1,0.7)
plot(times, values, type ="l") #Works as expected, gives a smooth curve and values[1] == values[length(values)]
values = generate_periodic_gp(times,1,0.8) #Fails, not positive definite
1 Like

@flaxter has committed the crime of talking about periodic kernels in the last few days, so maybe he can give better comments here, but I’ll try to just provide pointers.

Check out: Approximate GPs with Spectral Stuff

The Stan model at the top has Fourier in the name. Not sure what it is, but anything Fourier seems like it’d be good for periodic stuff. The arxiv paper on the Poisson intensity estimation he links there has another kernel defined on a periodic domain.

I just Googled periodic RBF and this page came up: http://www.cs.toronto.edu/~duvenaud/cookbook/index.html . It has some sorta strange periodic kernel there that might work!

1 Like

Thanks for the help - periodic kernel from http://www.cs.toronto.edu/~duvenaud/cookbook/index.html does the trick. Actually, I played before with the periodic kernel from Wikipedia (very slightly different formulation) and it gave weird results, but it must have been an implementation bug on my side, because now it works :-)

For future travellers, the formula for the periodic kernel is:

covariance[i,j] = (scale^2) * exp(-2*(sin(PI * abs(x[i] - x[j]) / period)^2) / (length ^ 2))

Thanks again!

4 Likes

Sorry for the late reply–you could also use the spectral mixture kernel (https://arxiv.org/pdf/1302.4245.pdf). Here is Stan code for a two component mixture (i.e. an additive kernel with two different periodic components) in case it’s helpful to someone.

  matrix[n1, n1] Sigma1;

  for (i in 1:n1) {
    Sigma1[i, i] <- var1 + var2;
    for (j in (i+1):n1) {
      Sigma1[i, j] <- var1 * exp(-(x1[i]-x1[j])^2*bw1) * cos(twopi * fabs(x1[i] - x1[j]) * period1) +
                      var2 * exp(-(x1[i]-x1[j])^2*bw2) * cos(twopi * fabs(x1[i] - x1[j]) * period2);
      Sigma1[j, i] <- Sigma1[i, j];
    }
  }
1 Like

Sorry to resurrect such an old thread, but just started playing with a periodic-kernel GP and I’m trying to wrap my head around how the period parameter is differentiated from the lengthscale parameter. I get that the thing that’s repeating every period might be irregularly shaped, like the example from the kernel cookbook linked earlier:
image

in which case lengthscale will determine the wiggliness inside the repeating bit and period will determine how often it repeats. But if we’re dealing with a repeating signal that is less complex, like a sine wave at the extreme of simplicity, I feel like lengthscale isn’t going to do any “work” anymore, and threatens to become non-identified from period, right?

1 Like

I honestly don’t know - in the application I was trying to use it, the period was fixed, so it was not an issue. My guess is that differentiating period and lengthscale could really be difficult and you should probably at least constrain period to be a priori much shorter than the observed data range while being much wider then the distance between several consecutive observations. At the same time you probably should constrain lenghtscale to not be much larger then period and (similarly to quadratic exponential kernel) longer the the distance between two consecutive observations.

1 Like

If we’re interpreting period as the distance on the x-axis between repetitions, it needs to be /period1 not *period1, right? The corrected code being:

matrix[n1, n1] Sigma1;

  for (i in 1:n1) {
    Sigma1[i, i] <- var1 + var2;
    for (j in (i+1):n1) {
      Sigma1[i, j] <- var1 * exp(-(x1[i]-x1[j])^2*bw1) * cos(twopi * fabs(x1[i] - x1[j]) / period1) +
                      var2 * exp(-(x1[i]-x1[j])^2*bw2) * cos(twopi * fabs(x1[i] - x1[j]) / period2);
      Sigma1[j, i] <- Sigma1[i, j];
    }
  }```

Looks good to me.