Gaussian Process Prediction

Hi, I am trying to fit a GP model. However, my predicted values do not see right. I am only interested in predicting and not forecasting. My stan code is from the user’s guide.

data {
  int<lower=1> N;
  real x[N];
  vector[N] y1;
}

transformed data {
  real delta = 1e-9;
}

parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  real<lower=0> sigma;
  vector[N] eta;
}

transformed parameters {
  vector[N] f;
  {
    matrix[N, N] L_K;
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);

    // diagonal elements
    for (n in 1:N)
      K[n, n] = K[n, n] + delta;

    L_K = cholesky_decompose(K);
    f = L_K * eta;
  }
}
model {
  rho ~ inv_gamma(5, 5);
  alpha ~ std_normal();
  sigma ~ std_normal();
  eta ~ std_normal();

  y1 ~ normal(f[1:N], sigma);
}
generated quantities {
  vector[N] y2;
  for (n in 1:N){
    y2[n] = normal_rng(f[n], sigma);
  }
}

My R code is

ti <- seq(from = -4, to = 4, length.out = 100)

set.seed(111)

y_true <- sin(4*pi*ti) + sin(7*pi*ti)
y <- y_true + rnorm(n = length(ti))

dtw <- list(N = length(y), x = ti, y1 = y)
mod_long <- cmdstanr::cmdstan_model(stan_file = "GP_pred.stan")
fit_long <- mod_long$sample(data = dtw, seed = 111, 
                            chains = 4, parallel_chains = 4, iter_warmup = 1000, iter_sampling = 1000)
fit_long$summary(variables = c("rho", "alpha", "sigma"))
y_fit <- fit_long$draws(variables = "y2")
y2f <- posterior::summarize_draws(y_fit, "mean")

plot(x = ti, y = y, type = "p")
lines(x = ti, y = y_true, type = "l", lty = 2, col = "red")
lines(x = ti, y = y2f$mean, type = "l", lty = 1, col = "blue")
legend("topleft", legend = c("Y-true", "Y-hat"),
       col = c("red", "blue"), lty = 2:1, cex = 0.8)

The plot below is of the true and predicted values

1 Like

Remove the true line and it looks ok fit. You might need tighter (more informative) prior for lengthscale.

4 Likes

Thank you very much. Using a half standard normal prior works

1 Like