Fitting a gaussian process

Hi,

I’m worked through the Rethinking book and I’m a beginner with STAN and also with Gaussian processes. I’ve generated simulated data and now want to estimate the parameters used for the simulation. With the help of this forum, the simulation is working. Now I struggling with getting good fit. I wondering if someone can point me to the mistake I did. Here is my Code:

library(rethinking)
library(MASS)

# Generate distance matrix
set.seed(5)
N_PARTICLE=10
particle_positions  <- matrix(runif(2*N_PARTICLE, 0, 1),nrow=N_PARTICLE, ncol=2)
particle_distance_matrix <- as.matrix(dist(particle_positions))

# These are the parameters I want to get back:
eta_sq = 3 # Maximum covariance between any two particles
rho_sq = 10 # rate of decline ofcorrelation with distance

delta = 0.01 # don't now what it is, but this is used in the book

# Generate the covariance matrix
covariance_matrix <- matrix(, nrow = N_PARTICLE, ncol = N_PARTICLE)
# Using gaussian kernel
for (i in 1:(N_PARTICLE-1)) {
    covariance_matrix[i, i] <- eta_sq + delta;
    for (j in (i+1):N_PARTICLE) {
        covariance_matrix[i,j] = eta_sq * exp(-rho_sq*(particle_distance_matrix[i,j])^2)
        covariance_matrix[j,i] = covariance_matrix[i,j]
    }
}
covariance_matrix[N_PARTICLE, N_PARTICLE] = eta_sq + delta;

# Generate samples
N_SAMPLES <- 50
shifts <- mvrnorm(n=N_SAMPLES,mu=rep(0, N_PARTICLE), Sigma=covariance_matrix)

shifts <- as.vector(t(shifts)) # Flatten the matrix
particle_id <- rep(1:N_PARTICLE,N_SAMPLES)

# Fit the gaussian process
dat_list <- list(
    S = shifts,
    N = N_PARTICLE,
    particle = particle_id,
    Dmat = particle_distance_matrix )

m14.8 <- ulam(
    alist(
        S <- k[particle],
        vector[N]:k ~ multi_normal( 0 , SIGMA ),
        matrix[N,N]:SIGMA <- cov_GPL2( Dmat , etasq , rhosq , 0.01 ),
        etasq ~ dexp( 2 ),
        rhosq ~ dexp( 0.5 )
), data=dat_list , chains=4 , cores=4 , iter=2000 )

precis(m14.8)

The estimated parameters are:
image

Which is obviously different from what I’ve used during simulation. What did I wrong?

You might find this helpful: https://betanalpha.github.io/assets/case_studies/gaussian_processes.html#4_Advanced_Topics

It’s not in the rethinking syntax, but the case study is really good.

1 Like

That sound interesting and I gonna read. But I still wondering why I’m not able to get my parameters back. The situation seems to simple: Given samples from multi normal with a covariance matrix that was generated with known parameters I want to get parameters back. So either I’m doing something wrong when generating the covariance or my model is wrong.

I haven’t tried with your data, and I could be off track, but I noticed you put an exponential prior on both marginal variance and length-scale. In my experience working with a gamma or inverse gamma works better, particularly for the length-scale parameter since allowing rho to approach zero allows you to have a kind of infinitely flexible function. What exactly the right prior distribution will be will depend on how flexible the function you are trying to model. Simulating fake data from priors will help in any case.

Hi, thanks for your comment. I made multiple changes (including yours):

  1. I’m only using one sample per particle, as this allowed me the next change
  2. S is now distributed multi_normal (this improved the estimates).
  3. Changed the prior for rho from exp to gamma.

What I observed:

  • etasq is now estimated correctly At least the mean is aroudn the true value.
  • The estimated value for rhosq is roughly the half to what I used in the simulation (e.g. Simulation: 20 -> Estimated mean: 8.21, Simulation: 10 -> Estimated: 5.26)

This makes me believe that I simulating it in the wrong way?

Here the updated code:

# Generate distance matrix
set.seed(5)
N_PARTICLE=50
particle_positions  <- matrix(runif(2*N_PARTICLE, 0, 1),nrow=N_PARTICLE, ncol=2)
particle_distance_matrix <- as.matrix(dist(particle_positions))
#particle_distance_matrix <- matrix(runif(N_PARTICLE*N_PARTICLE, 3, 20),nrow=N_PARTICLE,ncol=N_PARTICLE)

eta_sq = 3 # Maximum covariance between any two particles
rho_sq = 10 # rate of decline ofcorrelation with distance
delta = 0.01 # don't now what it is, but this is used in the book
sigma_i_equals_j = 0 # irrelevant, as we only have on observation per particle

covariance_matrix <- matrix(, nrow = N_PARTICLE, ncol = N_PARTICLE)
# Using gaussian kernel
for (i in 1:(N_PARTICLE-1)) {
    covariance_matrix[i, i] <- eta_sq + delta;
    for (j in (i+1):N_PARTICLE) {
        covariance_matrix[i,j] = eta_sq * exp(-rho_sq*(particle_distance_matrix[i,j])^2)
        covariance_matrix[j,i] = covariance_matrix[i,j]
    }
}
covariance_matrix[N_PARTICLE, N_PARTICLE] = eta_sq + delta;

N_SAMPLES <- 1
shifts <- mvrnorm(n=N_SAMPLES,mu=rep(0, N_PARTICLE), Sigma=sqrt(covariance_matrix))

shifts <- as.vector(t(shifts)) # Flatten the matrix
particle_id <- rep(1:N_PARTICLE,N_SAMPLES)

dat_list <- list(
    S = shifts,
    N = N_PARTICLE,
    N_F = N_SAMPLES,
    particle = particle_id,
    Dmat = particle_distance_matrix,
    delta = delta
)

m14.8 <- ulam(
    alist(
        #S <- k[particle],
        vector[N]:S ~ multi_normal( 0 , SIGMA ),
        matrix[N,N]:SIGMA <- cov_GPL2( Dmat , etasq , rhosq , delta ),
        etasq ~ dexp( 0.25 ),
        rhosq ~ gamma( 10, 1 )
), data=dat_list , chains=4 , cores=4 , iter=2000 )

Here is the stan code of the model:

functions{


    matrix cov_GPL2(matrix x, real sq_alpha, real sq_rho, real delta) {
        int N = dims(x)[1];
        matrix[N, N] K;
        for (i in 1:(N-1)) {
          K[i, i] = sq_alpha + delta;
          for (j in (i + 1):N) {
            K[i, j] = sq_alpha * exp(-sq_rho * square(x[i,j]) );
            K[j, i] = K[i, j];
          }
        }
        K[N, N] = sq_alpha + delta;
        return K;
    }
}
data{
    int particle[50];
    int N_F;
    int N;
    vector[50] S;
    real delta;
    matrix[50,50] Dmat;
}
parameters{
    real<lower=0> etasq;
    real<lower=0> rhosq;
}
model{
    matrix[N,N] SIGMA;
    rhosq ~ gamma( 10/1 , 1/1 );
    etasq ~ exponential( 0.25 );
    SIGMA = cov_GPL2(Dmat, etasq, rhosq, delta);
    S ~ multi_normal( rep_vector(0,N) , SIGMA );
}

Thanks for your help! It’s highly appreciated.

I don’t have too much time to look at this in depth now, but I can answer one minor question posed in your code immediately - the delta variable is just a small constant added to the covariance matrix to avoid float point problems in computations. 0.01 is maybe high - I think it’s typical to use 1e-9 or 1e-10. Not that this is necessarily the key to your problem, but it’s helpful to understand all the various pieces and what they do.

I would again really recommend stepping through the Betancourt case study with your data and try to translate your model into that syntax, starting with fake data simulation and deterministic sampling, then building up to priors over marginal deviance and length-scale. That’s the best way I know of to learn about GP models at any rate, and the process will almost certainly illuminate any problems you have.

2 Likes