Gaussian Process out-of-sample predictive distribution

Domain has a different mathematical meaning here. When GP people use domain, they're usually talking about it in a signal-processing sense, as in the time domain and frequency domain. Sure, you can use Fourier transforms and inverse Fourier transforms to get back and forth between the two.

But I think you mean predictions "forward in time". You can do this pretty trivially. Say you computed the covariance matrix for times [1,2,3]. You can predict forward in time by setting \tilde{x} = [1,2,3,4,5], which adds two evenly spaced points ahead. Not evenly spaced points? Try this: \tilde{x} = [1,2,3,4.2312,7.777]. Then compute the cross-covariance matrix and use it in the predictive equations you found in GPML. I suppose if you've defined dom(x) = [1,5] \subset \mathbf{R}, where this is a closed interval from 1 to 5, then the above has points that are not in our defined domain. It's pretty much the same process either way. Be careful trying to predict the future!
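To make that concrete, here's a minimal NumPy sketch of the GPML predictive equations (eqs. 2.23–2.24), assuming a squared-exponential kernel and illustrative hyperparameters and data; the training times [1,2,3] and \tilde{x} = [1,2,3,4,5] match the example above:

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, marginal_sd=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return marginal_sd**2 * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise_sd=0.1):
    """Predictive mean and covariance at x_test given noisy observations."""
    K = sq_exp_kernel(x_train, x_train) + noise_sd**2 * np.eye(len(x_train))
    K_star = sq_exp_kernel(x_train, x_test)        # cross-covariance
    K_star_star = sq_exp_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mu = K_star.T @ alpha
    cov = K_star_star - K_star.T @ np.linalg.solve(K, K_star)
    return mu, cov

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, 0.9, 0.4])                      # made-up observations
x_tilde = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # two points ahead
mu, cov = gp_predict(x, y, x_tilde)
```

The same function handles in-between points too: just put interpolation times in `x_tilde`. Note how the predictive variance (the diagonal of `cov`) grows as you move away from the data, which is the mathematical version of "be careful trying to predict the future".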

For in-between time points, see the R code above, where I set \tilde{x} to seq(0, 100, length.out = 1000).

EDIT: made the language more mathematically rigorous because it was ambiguous (changed "above is outside domain" to "above has points that are not inside our defined domain"). I also fixed a typo in the R snippet.

Sorry to resurrect an old post, but I believe the attached code is wrong.

Comparing with what is shown in the manual (https://mc-stan.org/docs/stan-users-guide/gaussian-processes.html#analytical-form-of-joint-predictive-inference), the correct code should be:

//https://mc-stan.org/docs/stan-users-guide/gaussian-processes.html#predictive-inference-with-a-gaussian-process
//https://likelyllc.com/2022/08/03/additive-gaussian-process-time-series-regression-in-stan/

functions {
   
  vector gp_pred_rng(array[] vector x2,
                     vector y, 
                     array[] vector x1,
                     real m, 
                     real l,
                     real sig,
                     real sigma,
                     real delta) {
    int N1 = rows(y);
    int N2 = size(x2);
    vector[N2] f2;
    {
      matrix[N1, N1] K = add_diag(gp_exp_quad_cov(x1, m, l) +  gp_dot_prod_cov(x1, sig), square(sigma));
      matrix[N1, N1] L_K = cholesky_decompose(K);
      vector[N1] L_K_div_y = mdivide_left_tri_low(L_K, y);
      vector[N1] R_K_div_y = mdivide_right_tri_low(L_K_div_y', L_K)';
      matrix[N1, N2] k_x1_x2 = gp_exp_quad_cov(x1, x2, m, l) + gp_dot_prod_cov(x1, x2, sig);
      vector[N2] f2_mu = (k_x1_x2' * R_K_div_y);
      matrix[N1, N2] v_pred = mdivide_left_tri_low(L_K, k_x1_x2);
      // covariance at the prediction points themselves (N2 x N2)
      matrix[N2, N2] cov_f2 = gp_exp_quad_cov(x2, m, l) + gp_dot_prod_cov(x2, sig) - v_pred' * v_pred;
      matrix[N2, N2] diag_delta = diag_matrix(rep_vector(delta, N2));
      f2 = multi_normal_rng(f2_mu, cov_f2 + diag_delta);
    }
    return f2;
  }
}
data {
  int<lower=1> N;
  int<lower=1> D;
  array[N] vector[D] x;
  vector[N] y;

  int<lower=1> N_pred;
  array[N_pred] vector[D] x_pred;
}
transformed data {
  real delta = 1e-9;
  vector[N] mu = rep_vector(0, N);
}
parameters {
  real<lower=0> l;
  real<lower=0> m;
  real<lower=0> sig0;

  real<lower=0> sigma;
}
transformed parameters {
  matrix[N, N] L_K;
  {
    matrix[N, N] K = gp_exp_quad_cov(x, m, l) + gp_dot_prod_cov(x, sig0);
    
    K = add_diag(K, square(sigma));
    L_K = cholesky_decompose(K);
  }
}
model {
  m ~ normal(0, 1);
  l ~ normal(0, 1);
  sig0 ~ normal(0, 1);

  sigma ~ normal(0, 1);
  
  y ~ multi_normal_cholesky(mu, L_K);
}
generated quantities {
  vector[N_pred] f_pred;
  vector[N_pred] y_pred;

  f_pred = gp_pred_rng(x_pred, y, x, m, l, sig0, sigma, delta);
  for (n2 in 1:N_pred) {
    y_pred[n2] = normal_rng(f_pred[n2], sigma);
  }
  
}


While the posterior median from the original code is the same, the predictive interval from the newer code is wider because it accounts for all the sources of uncertainty.
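The widening comes from the observation-noise term: the latent function f and a new observation y share the same predictive mean, but y's variance includes sigma^2 on top of f's. A small NumPy sketch (assuming a squared-exponential kernel and made-up data, not the Stan model above) makes the decomposition explicit:

```python
import numpy as np

def sq_exp(a, b, ls=1.0):
    """Squared-exponential kernel with unit marginal variance."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 20)
sigma = 0.3                                   # observation-noise SD
y = np.sin(x) + rng.normal(0.0, sigma, x.size)

K = sq_exp(x, x) + sigma**2 * np.eye(x.size)  # noisy training covariance
x_new = np.array([5.25])
k_star = sq_exp(x, x_new)

# latent-function variance at x_new
f_var = float(sq_exp(x_new, x_new) - k_star.T @ np.linalg.solve(K, k_star))
# predictive variance for a new *observation*: latent + noise
y_var = f_var + sigma**2

# the mean is identical either way; only the interval widens
mu = float(k_star.T @ np.linalg.solve(K, y))
```

This mirrors what the corrected Stan program does: `gp_pred_rng` draws the latent `f_pred`, and the `normal_rng(f_pred[n2], sigma)` step in `generated quantities` adds the noise variance back in.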