Hey all -

@fabio @Kon7

So here’s a “binary classification with GP”, or a “Gaussian prior, with a logistic likelihood”.

We can write this in math as follows, the full posterior being:

\int p(y | X, f) p(f | x) p(x |\theta)

With a non-gaussian likelihood functions, we can’t integrate/marginalize out the latent f, intsead, we use it as the parameter for the logit likelihood function:

p(y |X, f) = log(\frac{f}{1-f})

I believe this is correct, correct me if not. p(f |x) \sim GP(x, x'), p(x|\theta) will be the specified covariance function (sometimes an RBF).

The stan code is below. I’m using `x_pred`

for out of sample data (really could be in between two datapoints, which happens in spatial or temporal data, whatever). **NOTE**: Here I’m not using the GP’s covariance, I’m *only using the latent mean function, f* in the likelihood function.

```
functions {
vector gp_pred_rng(vector[] x_pred,
vector y1, vector[] x,
real magnitude, real[] length_scale) {
int N = rows(y1);
int N_pred = size(x_pred);
vector[N_pred] f2;
{
matrix[N, N] K = gp_matern52_cov(x, magnitude, length_scale);
matrix[N, N] L_K = cholesky_decompose(K);
vector[N] L_K_div_y1 = mdivide_left_tri_low(L_K, y1);
vector[N] K_div_y1 = mdivide_right_tri_low(L_K_div_y1', L_K)';
matrix[N, N_pred] k_x_x_pred = gp_matern52_cov(x, x_pred, magnitude, length_scale);
f2 = (k_x_x_pred' * K_div_y1);
}
return f2;
}
}
data {
int<lower=1> N;
int<lower=1> D;
vector[D] x[N];
int<lower=0,upper=1> y[N];
int<lower=1> N_pred;
vector[D] x_pred[N_pred];
}
parameters {
real<lower=0> magnitude;
real<lower=0> length_scale; // Next release, we can use length_scale[D] for multiple length scales, (ARD)
vector[N] eta;
}
transformed parameters {
vector[N] f;
{
matrix[N, N] K;
matrix[N, N] L_K;
K = gp_matern52_cov(x, magnitude, length_scale);
K = add_diag(K, 1e-12);
L_K = cholesky_decompose(K);
f = L_K * eta;
}
}
model {
magnitude ~ // what should the magnitude prior be?
length_scale ~ // what should the length scale prior be?
eta ~ normal(0, 1);
y ~ bernoulli_logit(f);
}
generated quantities {
vector[N_pred] f_pred = gp_pred_rng(x_pred, f, x, magnitude, length_scale, sig);
int y_pred[N_pred];
int y_pred_in[N];
for (n in 1:N) y_pred_in[n] = bernoulli_logit_rng(f[n]); // in sample prediction
for (n in 1:N_pred) y_pred[n] = bernoulli_logit_rng(f_pred[n]); // out of sample predictions
}
```

So for the trinary classification case, give me some time. It will require a bit of thought.