Fitting a mixture of Gaussian processes


I have already fit a mixture of multivariate normals in Stan to some simulated data and managed to recover the ‘true’ values. I’m now wanting to fit a mixture of Gaussian processes (each GP will have it’s own distinct parameters for it’s mean and covariance functions) in Stan but I’m not sure if this is even possible without discrete cluster indicator variables.
I’m wanting to fit the model to data given below and infer the parameters of each GP. The data is simulated from two GPs with different mean functions.
The following code runs but we don’t recover the parameters of the two GPs. I was wondering if anyone had any thoughts on how I could implement this in Stan (or if it’s even possible).
Thanks in advance.

functions {
  vector mean_fun(vector theta,vector beta,int n,int p){
  matrix[n,p] X;
  for(i in 1:n){
 matrix cov_fun(vector theta, real r, real nu, real sigma, int n){
   matrix[n,n] out;
   for(i in 1:n){
      for(j in i:n){
         out[i,j] = exp(-((theta[i]-theta[j])/r)^2);
         if(i==j) out[i,i] = out[i,i]+nu;
         out[i,j] = sigma*out[i,j];
         out[j,i] = out[i,j];
  int<lower=0> K;
  int<lower=0> n;
  int<lower=0> d;
  int<lower=0> p;
  vector[n] x;
  vector[n] theta;
parameters {
  vector[p] beta_mix[K];
  real<lower=0> r_mix[K];
  real<lower=0> nu_mix[K];
  real<lower=0> sigma_mix[K];
  simplex[K] Pi;
model {
  real ll[K];
  for(j in 1:K){
    r_mix[j] ~ lognormal(-1,0.5);
    sigma_mix[j] ~ inv_gamma(1,0.5);
    nu_mix[j] ~ inv_gamma(1,0.5);
    beta_mix[j] ~ normal(0,1);
  for(j in 1:K){
  target += log_sum_exp(ll);

data.txt (540 Bytes)


Do you have any prior information about the two categories? For example, the plot shows that one seems to have a different latent mean function than the other; is that just for data simulating, or does it represent a genuine expectation?


Thanks for the reply. No, there’s no prior information about the two categories.


You might want to check out Michael Betancourt’s case study on mixture models. They can be hard to identify even in much simpler cases than GPs.

Do the two GP functions have different covariance functions? If not, it seems like they could be combined. I usually see GPs done with zero mean (the mean gets subtracted out elsewhere before the multivariate normal is applied).