The manual offers a nice example for a single GP:
data {
int<lower=1> N1;
array[N1] real x1;
vector[N1] y1;
int<lower=1> N2;
array[N2] real x2;
}
transformed data {
real delta = 1e-9;
int<lower=1> N = N1 + N2;
array[N] real x;
for (n1 in 1:N1) {
x[n1] = x1[n1];
}
for (n2 in 1:N2) {
x[N1 + n2] = x2[n2];
}
}
parameters {
real<lower=0> rho;
real<lower=0> alpha;
real<lower=0> sigma;
vector[N] eta;
}
transformed parameters {
vector[N] f;
{
matrix[N, N] L_K;
matrix[N, N] K = gp_cov_exp_quad(x, alpha, rho);
// diagonal elements
for (n in 1:N) {
K[n, n] = K[n, n] + delta;
}
L_K = cholesky_decompose(K);
f = L_K * eta;
}
}
model {
rho ~ inv_gamma(5, 5);
alpha ~ std_normal();
sigma ~ std_normal();
eta ~ std_normal();
y1 ~ normal(f[1:N1], sigma);
}
generated quantities {
vector[N2] y2;
for (n2 in 1:N2) {
y2[n2] = normal_rng(f[N1 + n2], sigma);
}
}
Source: Gaussian Processes
Specifically, this line is the key line to specify the kernel
matrix[N, N] K = gp_cov_exp_quad(x, alpha, rho);
Now, if I have multiple kernels, can I just add them together in this line and keep the rest of the code intact? That is, I mean:
matrix[N, N] K = gp_cov_exp_quad(x, alpha, rho) + gp_dot_prod_cov(x, sigma) + gp_exponential_cov(x, sigma2, length_scale);
I am asking because this seems too good to be true, but I am not truly sure. Thank you.