Gaussian process regression error (rejecting initial value)

Hello Stan users,

I am trying to re-purpose some of Rob Trangucci’s excellent code on Gaussian processes, but I am getting some errors and I am not sure what to do: https://github.com/rtrangucci/multi_output_gps

Rejecting initial value:
Error evaluating the log probability at the initial value.

Here is the modified code. I originally commented out some of the program, thinking that was the issue, but I am still getting the same error. The big change I am working toward is a Poisson model where I can include covariates observed at the observation level, so that I end up with observation-level effects that are independent of the GP grid.

functions {
  matrix gp_exp_quad_chol(real[] x, real alpha, real len, real jitter) {
    int dim_x = size(x);
    matrix[dim_x, dim_x] L_K_x;
    matrix[dim_x, dim_x] K_x = cov_exp_quad(x, alpha, len);
    for (n in 1:dim_x)
      K_x[n,n] = K_x[n,n] + jitter;
    L_K_x = cholesky_decompose(K_x);
    return L_K_x;
  }
}
data {
  // Model is for outcomes that have been
  // observed on a 2 dimensional grid
  int<lower=1> N; // Number of observations
  int<lower=1> K; // Number of predictors
  int<lower=1> dim_x_1; // size of grid dimension 1
  int<lower=1> dim_x_2; // size of grid dimension 2
  real x_1[dim_x_1]; // locations - 1st dimension
  real x_2[dim_x_2]; // locations - 2nd dimension
  int<lower=0> y[N];  // Outcome
                      //matrix[N, K] X; // predictor matrix

  int row_i[N]; // row position of each observation
  int col_j[N]; // col position of each observation
  

}
parameters {
                                //real w0; //intercept
                                //vector[K] beta; //PH predictors
  real<lower=0> len_scale_x_1; // length-scale - 1st dimension
  real<lower=0> len_scale_x_2; // length-scale - 2nd dimension
  real<lower=0> alpha; // Scale of outcomes

  // Standardized latent GP
  matrix[dim_x_1, dim_x_2] y_tilde;
}
model {
  matrix[dim_x_1, dim_x_2] latent_gp; 
  {
    matrix[dim_x_1, dim_x_1] L_K_x_1 = gp_exp_quad_chol(x_1, 1.0, len_scale_x_1, 1e-12);
    matrix[dim_x_2, dim_x_2] L_K_x_2 = gp_exp_quad_chol(x_2, alpha, len_scale_x_2, 1e-12);

    // latent_gp is matrix-normal with among-column covariance K_x_1
    // among-row covariance K_x_2
    
    latent_gp = L_K_x_1 * y_tilde * L_K_x_2';
  }
  // priors
  len_scale_x_1 ~ gamma(8, 2);
  len_scale_x_2 ~ gamma(8, 2);
  alpha ~ normal(0, 1);
  to_vector(y_tilde) ~ normal(0, 1);
                                    // weakly informative prior for the intercept
                                    //w0 ~ normal(0,10);
                                    //Weakly informative prior for the other variables in data
                                    //beta ~  normal(0,10); //w0 + X * beta +

  // likelihood 
   y ~ poisson_log(to_vector(latent_gp[row_i, col_j]'));
}

Thanks for any help you can provide!

-Patrick

I think your problem is with this:

y ~ poisson_log(to_vector(latent_gp[row_i, col_j]'));

Multiple indexing like that returns a matrix, not an array: latent_gp[row_i, col_j] gives you the full N-by-N submatrix of every row/column combination, not the N element-wise pairs you want (I think this follows how R does its matrix multiple indexing). I think the thing you want is:

for(n in 1:N)
  y[n] ~ poisson_log(latent_gp[row_i[n], col_j[n]]);
}
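
To make the difference concrete, here is a rough, untested sketch with placeholder names (m, rows_idx, cols_idx are not part of your model):

matrix[3, 3] m;
int rows_idx[2] = {1, 3};
int cols_idx[2] = {2, 3};
// m[rows_idx, cols_idx] picks out every row/column combination,
// i.e. the 2 x 2 block {{m[1,2], m[1,3]}, {m[3,2], m[3,3]}},
// not just the two elements m[1,2] and m[3,3].
matrix[2, 2] sub = m[rows_idx, cols_idx];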

Hope that helps! (tried it on my computer with fake data and it ran)

edit: changed it’s to its :D

That did it! It is running really slowly, but that is probably because of the size of the data set (I have only worked on smaller fake data so far). I will have to do some additional testing.

Thanks for your help!


I am still having a hard time sampling this model. I talked to some folks and they mentioned that the number of observations I have is what is driving the slowdown.

I have roughly 800,000 observations on a 50 by 50 grid. Is there any way to vectorize the code that you suggested?

Thanks again for any help you can provide!

vector[N] log_lambda;
for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]];
y ~ poisson_log(log_lambda);
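
If you want to skip the per-element copy entirely, another option (untested sketch; cell_idx is just a placeholder name) is to precompute a linear index into the flattened grid in transformed data. to_vector() flattens a matrix in column-major order, so element (r, c) of a dim_x_1 by dim_x_2 matrix lands at position (c - 1) * dim_x_1 + r:

transformed data {
  int cell_idx[N];
  for (n in 1:N)
    cell_idx[n] = (col_j[n] - 1) * dim_x_1 + row_i[n]; // column-major position of cell (row_i[n], col_j[n])
}

and then in the model block:

  // index the flattened GP once; poisson_log accepts the resulting vector
  y ~ poisson_log(to_vector(latent_gp)[cell_idx]);

I have not checked how much that buys you over the loop above.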

Hi everyone,

I am new to Stan and I am having a hard time running my code in MATLAB. Every time I run it, I get "Rejecting initial value: Error evaluating the log probability at the initial value." I have put part of my code here. Any suggestions? Thank you in advance.

code = {
'data {'
'  int<lower=0> T;'
'  int<lower=0> N;'
'  real<lower=0,upper=1> Fhat[N,T];'
'  int<lower=0,upper=1> U[N,N];'
'}'
'parameters {'
'  simplex[N] M[T];'
'}'
'transformed parameters {'
'  real<lower=0,upper=1> F[N,T];'
'  for (i in 1:T){'
'    for (n in 1:N){'
'      F[n,i] = 0.0;'
'      for (k in 1:N){'
'        F[n,i] = F[n,i] + U[i,k]*M[i][k];'
'      }'
'    }'
'  }'
'}'
'model {'
'  for (i in 1:T){'
'    M[i] ~ dirichlet([0.5, 0.5, 0.5, 0.5, 0.5, 0.5]'');'
'  }'
'  for (n in 1:N) {'
'    for (i in 1:T){'
'      Fhat[n,i] ~ normal(F[n,i], 0.01);'
'    }'
'  }'
'}'
};
evol_dat = struct('T',11,'N',6,...
    'Fhat',Fdata,...
    'U',Udata);
%%
model = StanModel('verbose',true,'model_code',code,'data',evol_dat);
model.compile();

fit_vb = model.vb();

print(fit_vb);

Thanks Ben! I just had the slower run finish, so I will implement your suggestions. I tested your code with 100 draws. The gradient eval was cut by about half!


I am nearing the end of this GP model and I wanted to check my Stan logic because I am getting some results I just do not understand. I am trying to add additional covariates that are independent of the GP. However, when I add them to the model, I get a syntax error depending on how I write it. Here is the fully updated code:

functions {
  matrix gp_exp_quad_chol(real[] x, real alpha, real len, real jitter) {
    int dim_x = size(x);
    matrix[dim_x, dim_x] L_K_x;
    matrix[dim_x, dim_x] K_x = cov_exp_quad(x, alpha, len);
    for (n in 1:dim_x)
      K_x[n,n] = K_x[n,n] + jitter;
    L_K_x = cholesky_decompose(K_x);
    return L_K_x;
  }
}
data {
  // Model is for outcomes that have been
  // observed on a 2 dimensional grid
  int<lower=1> N; // Number of observations
  int<lower=1> K; // Number of predictors
  int<lower=1> dim_x_1; // size of grid dimension 1
  int<lower=1> dim_x_2; // size of grid dimension 2
  real x_1[dim_x_1]; // locations - 1st dimension
  real x_2[dim_x_2]; // locations - 2nd dimension
  int<lower=0> y[N];  // Outcome
  matrix[N, K] X; // predictor matrix

  int row_i[N]; // row position of each observation
  int col_j[N]; // col position of each observation
  

}
parameters {
  real w0; //intercept
  vector[K] beta; //PH predictors
  real<lower=0> len_scale_x_1; // length-scale - 1st dimension
  real<lower=0> len_scale_x_2; // length-scale - 2nd dimension
  real<lower=0> alpha; // Scale of outcomes
  // Standardized latent GP
  matrix[dim_x_1, dim_x_2] y_tilde;
}

transformed parameters {
matrix[dim_x_1, dim_x_2] latent_gp; 
  {
    matrix[dim_x_1, dim_x_1] L_K_x_1 = gp_exp_quad_chol(x_1, 1.0, len_scale_x_1, 1e-12);
    matrix[dim_x_2, dim_x_2] L_K_x_2 = gp_exp_quad_chol(x_2, alpha, len_scale_x_2, 1e-12);

    // latent_gp is matrix-normal with among-column covariance K_x_1
    // among-row covariance K_x_2
    
    latent_gp = L_K_x_1 * y_tilde * L_K_x_2';
  }
}
model {
  vector[N] log_lambda;
  
  // priors
  len_scale_x_1 ~ gamma(8, 2);
  len_scale_x_2 ~ gamma(8, 2);
  alpha ~ normal(0, 1);
  to_vector(y_tilde) ~ normal(0, 1);
  // weakly informative prior for the intercept
  w0 ~ normal(0,10);
  //Weakly informative prior for the other variables in data
  beta ~  student_t(7, 0, 2.5); 

  // likelihood 
   
  for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]]  +  w0  +  X[n] * beta;
  y ~ poisson_log(log_lambda);
}

However, in the log_lambda loop, if I write the code this way I get an error:

for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]]  +  w0  + beta * X[n];

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

base type mismatch in assignment; variable name = log_lambda, type = real; right-hand side type=matrix

ERROR at line 68
67:
68: for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]] + w0 + beta * X[n];
^
69: y ~ poisson_log(log_lambda);

PARSER EXPECTED:
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
one node produced an error: failed to parse Stan model ‘lgcp_testben’ due to the above error.

Am I missing something? I don’t see how the first formulation would run while the other throws an error.

Thanks!

X[n] is a row vector, and beta is a column vector, so if you want the dot product of these two it’s X[n] * beta.

beta * X[n] is a column vector times a row vector, which gives a K x K matrix; that is why the parser complains about assigning a matrix to a real.
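
As an untested side note, once the X[n] * beta form is in place you can also drop the loop and build the whole linear predictor with matrix algebra (gp_at_obs is just a placeholder name):

  vector[N] gp_at_obs;
  for (n in 1:N)
    gp_at_obs[n] = latent_gp[row_i[n], col_j[n]];
  // real + vector + vector: w0 is broadcast across all N observations
  y ~ poisson_log(w0 + X * beta + gp_at_obs);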

Hope that helps!

Oops, simple linear algebra 101 mistake. Shouldn’t code on an empty stomach. Thanks, Ben!