Multi-dimensional Gaussian process as a prior

Dear Stan community,

I have been studying Gaussian processes for the first time. My goal is to infer a property of a material from a set of data. I’ll refer to this property here as P. The property varies spatially, and my problem is unidimensional, so P = P(x), where x denotes the spatial coordinate. My objective is then to infer a vector of parameters \vec{P} = [P_1, P_2, ..., P_I], where P_i = P(x_i) and x_i denotes a node of the grid I use to discretize my spatial domain; I is the number of nodes. I decided to use a Gaussian process as the prior on the elements of \vec{P}.

The code that I have is similar to this:

functions {
  // Define here some useful functions
}

data {
  int n_data; // Number of data points
  vector[n_data] D_vec; // Vector of data
  int I; // Number of nodes used to discretize the spatial domain
  array[I] real x_vec; // Numerical grid. For example, [0, 0.25, 0.5, 0.75, 1],
                       // where the spatial domain from x = 0 to x = 1 is
                       // discretized by five equally spaced nodes
}

parameters {
  vector[I] P_vec; // Vector of parameters to be estimated
  real<lower=0> sigma; // GP marginal standard deviation hyperparameter
  real<lower=0> length_scale; // GP length-scale hyperparameter
}

model {

  // Define here some model for the likelihood
  
  // Prior (GP)
  vector[I] mu = rep_vector(0, I);
  matrix[I, I] K = gp_exp_quad_cov(x_vec, sigma, length_scale);
  K = add_diag(K, 1e-9); // Jitter for numerical stability
  matrix[I, I] L_K = cholesky_decompose(K);
  P_vec ~ multi_normal_cholesky(mu, L_K);

  // Hyperpriors
  sigma ~ normal(0, 1);
  length_scale ~ normal(0, 1);
} 

The example above provided me good estimates of \vec{P}, which represent local values of the function P(x). Now I want to extend this idea to a two-dimensional case, that is, P = P(x, y). As far as I understand, I am supposed to use a multi-dimensional Gaussian process. According to the manual, the input in that case should be an array of vectors, not an array of scalars as shown above. It is not very clear to me how I am supposed to define this array of vectors.

Besides, my first instinct is to work not with a vector \vec{P} but with a matrix \mathbf{P} of dimension I \times J, where each element is P_{ij} = P(x_i, y_j), with i = 1, \dots, I and j = 1, \dots, J. So I believe it makes sense to define \vec{P} as something like

P_vec = to_vector(P_matrix)
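To be explicit about the ordering (a sketch of my understanding: to_vector stacks the columns of a matrix):

matrix[I, J] P_matrix;                     // P_matrix[i, j] = P(x_i, y_j)
vector[I * J] P_vec = to_vector(P_matrix); // column-major flattening
// so P_vec[(j - 1) * I + i] corresponds to P(x_i, y_j)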

but I am not sure whether this is how I should set up the model. It would be great if someone could give me some tips on how to proceed.

Thanks in advance for the attention!

Best,
Rodrigo

Our best doc on this is in the User’s Guide: 10 Gaussian Processes | Stan User’s Guide

But I see there aren’t any multidimensional examples! The idea is just that you generalize the kernel computation. For example, the kernel entries are computed as

K[i, j] = exp(-0.5 * square(x[i] - x[j]));

in the example here: 10.2 Simulating from a Gaussian process | Stan User’s Guide

Instead, you could build something like exp(-0.5 * dot_self(x[i] - x[j])). Here dot_self(x[i] - x[j]) is the squared Euclidean distance between x[i] and x[j], which can also be coded as squared_distance(x[i], x[j]).
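For instance, a minimal sketch of that generalized loop (assuming x is declared as array[N] vector[2] x, and leaving out the magnitude and length-scale hyperparameters as in the User’s Guide example):

matrix[N, N] K;
for (i in 1:N) {
  for (j in 1:N) {
    // dot_self(x[i] - x[j]) is the squared Euclidean distance;
    // equivalently, squared_distance(x[i], x[j])
    K[i, j] = exp(-0.5 * dot_self(x[i] - x[j]));
  }
  K[i, i] += 1e-9; // small jitter for numerical stability
}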

The exact function defining your covariance will depend on which kind of GP you are defining.

Sorry this took so long to get an answer!


Hi Bob,

Thank you very much for your reply, I really appreciate it! And there’s definitely no need to apologize.

I did understand the idea of generalizing the kernel computation. I was mostly facing issues implementing it with the gp_exp_quad_cov function. I read that I am supposed to pass my input as an array of vectors, not an array of scalars as in the unidimensional case. It turns out that I was declaring my array as array[2] vector[N] my_array, where N is the number of nodes, whereas I should actually declare it as array[N] vector[2] my_array. A silly mistake that gave me some headaches! :)
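In case it is useful to someone else, here is a sketch of how I assemble the two-dimensional input (y_vec and J are my names for the second grid and its number of nodes, analogous to x_vec and I; P_vec is declared as vector[N] in the parameters block):

transformed data {
  int N = I * J;
  array[N] vector[2] xy; // one (x, y) point per grid node
  for (i in 1:I) {
    for (j in 1:J) {
      // column-major order, matching to_vector(P_matrix)
      xy[(j - 1) * I + i] = [x_vec[i], y_vec[j]]';
    }
  }
}

model {
  matrix[N, N] K = gp_exp_quad_cov(xy, sigma, length_scale);
  K = add_diag(K, 1e-9); // jitter for numerical stability
  P_vec ~ multi_normal_cholesky(rep_vector(0, N), cholesky_decompose(K));
}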

One more time, thanks for the help!

Best,
Rodrigo