Gaussian Process out-of-sample predictive distribution

Domain has a different mathematical meaning here. When GP people use domain, they're usually talking about it in a signal-processing sense, as in the time domain and frequency domain. Sure, you can use Fourier transforms and inverse Fourier transforms to get back and forth between the two.

But I think you mean predictions "forward in time". You can do this pretty trivially. Say you computed the covariance matrix for times [1,2,3]. You can predict forward in time by setting \tilde{x} = [1,2,3,4,5], which adds two evenly spaced points ahead. Not evenly spaced points? Try this: \tilde{x} = [1,2,3,4.2312,7.777]. Then compute the cross-covariance matrix and use it in the predictive equations you found in GPML. I suppose if you've defined dom(x) = [1,5] \subset \mathbf{R}, where this is a closed interval from 1 to 5, then the above has points that are not in our defined domain. It's pretty much the same process either way. Be careful trying to predict the future!
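To make that concrete, here's a minimal NumPy sketch of the GPML predictive equations (eqs. 2.23–2.24), assuming a squared-exponential kernel and illustrative hyperparameters and data; the training times [1,2,3] and \tilde{x} = [1,2,3,4,5] match the example above:

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, marginal_sd=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return marginal_sd**2 * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise_sd=0.1):
    """Predictive mean and covariance at x_test given noisy observations."""
    K = sq_exp_kernel(x_train, x_train) + noise_sd**2 * np.eye(len(x_train))
    K_star = sq_exp_kernel(x_train, x_test)        # cross-covariance
    K_star_star = sq_exp_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mu = K_star.T @ alpha
    cov = K_star_star - K_star.T @ np.linalg.solve(K, K_star)
    return mu, cov

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, 0.9, 0.4])                      # made-up observations
x_tilde = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # two points ahead
mu, cov = gp_predict(x, y, x_tilde)
```

The same function handles in-between points too: just put interpolation times in `x_tilde`. Note how the predictive variance (the diagonal of `cov`) grows as you move away from the data, which is the mathematical version of "be careful trying to predict the future".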

For in-between time points, see the R code above, where I set \tilde{x} to seq(0, 100, length.out = 1000).

EDIT: made the language more mathematically rigorous because it was ambiguous (changed "above is outside domain" to "above has points that are not inside our defined domain"). I also fixed a typo in the R snippet.

Sorry to resurrect an old post, but I believe the attached code is wrong.

Comparing with what is shown in the manual (https://mc-stan.org/docs/stan-users-guide/gaussian-processes.html#analytical-form-of-joint-predictive-inference), the correct code should be:

//https://mc-stan.org/docs/stan-users-guide/gaussian-processes.html#predictive-inference-with-a-gaussian-process
//https://likelyllc.com/2022/08/03/additive-gaussian-process-time-series-regression-in-stan/

functions {
   
  vector gp_pred_rng(array[] vector x2,
                     vector y, 
                     array[] vector x1,
                     real m, 
                     real l,
                     real sig,
                     real sigma,
                     real delta) {
    int N1 = rows(y);
    int N2 = size(x2);
    vector[N2] f2;
    {
      matrix[N1, N1] K = add_diag(gp_exp_quad_cov(x1, m, l) +  gp_dot_prod_cov(x1, sig), square(sigma));
      matrix[N1, N1] L_K = cholesky_decompose(K);
      vector[N1] L_K_div_y = mdivide_left_tri_low(L_K, y);
      vector[N1] R_K_div_y = mdivide_right_tri_low(L_K_div_y', L_K)';
      matrix[N1, N2] k_x1_x2 = gp_exp_quad_cov(x1, x2, m, l) + gp_dot_prod_cov(x1, x2, sig);
      vector[N2] f2_mu = (k_x1_x2' * R_K_div_y);
      matrix[N1, N2] v_pred = mdivide_left_tri_low(L_K, k_x1_x2);
      // covariance at the prediction points themselves (N2 x N2)
      matrix[N2, N2] cov_f2 = gp_exp_quad_cov(x2, m, l) + gp_dot_prod_cov(x2, sig) - v_pred' * v_pred;
      matrix[N2, N2] diag_delta = diag_matrix(rep_vector(delta, N2));
      f2 = multi_normal_rng(f2_mu, cov_f2 + diag_delta);
    }
    return f2;
  }
}
data {
  int<lower=1> N;
  int<lower=1> D;
  array[N] vector[D] x;
  vector[N] y;

  int<lower=1> N_pred;
  array[N_pred] vector[D] x_pred;
}
transformed data {
  real delta = 1e-9;
  vector[N] mu = rep_vector(0, N);
}
parameters {
  real<lower=0> l;
  real<lower=0> m;
  real<lower=0> sig0;

  real<lower=0> sigma;
}
transformed parameters {
  matrix[N, N] L_K;
  {
    matrix[N, N] K = gp_exp_quad_cov(x, m, l) + gp_dot_prod_cov(x, sig0);
    
    K = add_diag(K, square(sigma));
    L_K = cholesky_decompose(K);
  }
}
model {
  m ~ normal(0, 1);
  l ~ normal(0, 1);
  sig0 ~ normal(0, 1);

  sigma ~ normal(0, 1);
  
  y ~ multi_normal_cholesky(mu, L_K);
}
generated quantities {
  vector[N_pred] f_pred;
  vector[N_pred] y_pred;

  f_pred = gp_pred_rng(x_pred, y, x, m, l, sig0, sigma, delta);
  for (n2 in 1:N_pred) {
    y_pred[n2] = normal_rng(f_pred[n2], sigma);
  }
  
}


While the posterior median from the original code is the same, the predictive interval from the newer code is wider because it accounts for all the sources of uncertainty.
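The widening comes from the observation-noise term: the latent function f and a new observation y share the same predictive mean, but y's variance includes sigma^2 on top of f's. A small NumPy sketch (assuming a squared-exponential kernel and made-up data, not the Stan model above) makes the decomposition explicit:

```python
import numpy as np

def sq_exp(a, b, ls=1.0):
    """Squared-exponential kernel with unit marginal variance."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 20)
sigma = 0.3                                   # observation-noise SD
y = np.sin(x) + rng.normal(0.0, sigma, x.size)

K = sq_exp(x, x) + sigma**2 * np.eye(x.size)  # noisy training covariance
x_new = np.array([5.25])
k_star = sq_exp(x, x_new)

# latent-function variance at x_new
f_var = float(sq_exp(x_new, x_new) - k_star.T @ np.linalg.solve(K, k_star))
# predictive variance for a new *observation*: latent + noise
y_var = f_var + sigma**2

# the mean is identical either way; only the interval widens
mu = float(k_star.T @ np.linalg.solve(K, y))
```

This mirrors what the corrected Stan program does: `gp_pred_rng` draws the latent `f_pred`, and the `normal_rng(f_pred[n2], sigma)` step in `generated quantities` adds the noise variance back in.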