# Gaussian process regression error (rejecting initial value)

Hello Stan users,

I am trying to re-purpose some of Rob Trangucci’s excellent code on Gaussian processes but I am getting some errors and I am not sure what to do https://github.com/rtrangucci/multi_output_gps

Rejecting initial value:
Error evaluating the log probability at the initial value.

Here is the modified code. I originally commented out some of the program thinking that was the issue but I am still getting the same error. The big change that I am working on is that I need a Poisson model where I can observe covariate level data at the observation level. The end result being that I would have observation level effects that are independent of the GP grid.

``````functions {
matrix gp_exp_quad_chol(real[] x, real alpha, real len, real jitter) {
int dim_x = size(x);
matrix[dim_x, dim_x] L_K_x;
matrix[dim_x, dim_x] K_x = cov_exp_quad(x, alpha, len);
for (n in 1:dim_x)
K_x[n,n] = K_x[n,n] + jitter;
L_K_x = cholesky_decompose(K_x);
return L_K_x;
}
}
data {
// Model is for outcomes that have been
// observed on a 2 dimensional grid
int<lower=1> N; // Number of observations
int<lower=1> K; // Number of predictors
int<lower=1> dim_x_1; // size of grid dimension 1
int<lower=1> dim_x_2; // size of grid dimension 2
real x_1[dim_x_1]; // locations - 1st dimension
real x_2[dim_x_2]; // locations - 2nd dimension
int<lower=0> y[N];  // Outcome
//matrix[N, K] X; // predictor matrix

int row_i[N]; // row position of each observation
int col_j[N]; // col position of each observation

}
parameters {
//real w0; //intercept
//vector[K] beta; //PH predictors
real<lower=0> len_scale_x_1; // length-scale - 1st dimension
real<lower=0> len_scale_x_2; // length-scale - 2nd dimension
real<lower=0> alpha; // Scale of outcomes

// Standardized latent GP
matrix[dim_x_1, dim_x_2] y_tilde;
}
model {
matrix[dim_x_1, dim_x_2] latent_gp;
{
matrix[dim_x_1, dim_x_1] L_K_x_1 = gp_exp_quad_chol(x_1, 1.0, len_scale_x_1, 1e-12);
matrix[dim_x_2, dim_x_2] L_K_x_2 = gp_exp_quad_chol(x_2, alpha, len_scale_x_2, 1e-12);

// latent_gp is matrix-normal with among-column covariance K_x_1
// among-row covariance K_x_2

latent_gp = L_K_x_1 * y_tilde * L_K_x_2';
}
// priors
len_scale_x_1 ~ gamma(8, 2);
len_scale_x_2 ~ gamma(8, 2);
alpha ~ normal(0, 1);
to_vector(y_tilde) ~ normal(0, 1);
// weakly informative prior for the intercept
//w0 ~ normal(0,10);
//Weakly informative prior for the other variables in data
//beta ~  normal(0,10); //w0 + X * beta +

// likelihood
y ~ poisson_log(to_vector(latent_gp[row_i, col_j]'));
}
``````

-Patrick

I think your problem is with this:

``````y ~ poisson_log(to_vector(latent_gp[row_i, col_j]'));
``````

The multiple indexing like that returns a matrix, not an array (I think this copies after how R does its matrix multiple indexing). I think the thing you want is:

``````for(n in 1:N)
y[n] ~ poisson_log(latent_gp[row_i[n], col_j[n]]);
}
``````

Hope that helps! (tried it on my computer with fake data and it ran)

edit: changed it’s to its :D

That did it! It is running really slow, but that is probably because of the size of the data set (I have only worked on smaller fake data so far). I will have to do some additional testing.

1 Like

I am still having a hard time sampling this model. I talked to some folks and they mentioned that the number of observations I have is what is driving the slow down.

I have about ~800,000 observations on a 50 by 50 grid. Is there anyway to vectorize the code that you suggested?

``````vector[N] log_lambda;
for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]];
y ~ poisson_log(log_lambda);
``````
1 Like

Hi everyone,

I am new to stan and I have hard time running my code in matlab. Every time, I run it, I face this problem “Rejecting initial value:
Error evaluating the log probability at the initial value.” I put part of my code here. Any suggestion? Thank you in advance.

code = {
‘data {’
’ int<lower=0> T; ‘
’ int<lower=0> N; ‘
’ real<lower=0,upper=1> Fhat[N,T]; ‘
’ int<lower=0,upper=1> U[N,N]; ‘
’}’
‘parameters {’
’ simplex[N] M[T]; ‘
’}’
‘transformed parameters {’
’ real<lower=0,upper=1> F[N,T]; ‘
’ for (i in 1:T){’
’ for (n in 1:N){’
’ F[n,i]=0.0;’
’ for (k in 1:N){’
’ F[n,i]= F[n,i] + U[i,k]*M[i][k];’
’ }’
’ }’
’ }’
’}’
‘model {’
‘for (i in 1:T){’
’ M[i] ~ dirichlet([0.5, 0.5, 0.5, 0.5, 0.5, 0.5]’’);’
’}’
’ for(n in 1:N) {’
’ for (i in 1:T){’
’ Fhat[n,i] ~ normal(F[n,i],0.01);’
’ }’
’ }’
’}’
};
evol_dat = struct(‘T’,11,‘N’,6,…
‘Fhat’,Fdata,…
‘U’,Udata);
%%
model = StanModel(‘verbose’,true,‘model_code’,code,‘data’,evol_dat);
model.compile();

fit_vb = model.vb();

print(fit_vb);

Thanks Ben! I just had the slower run finish, so I will implement your suggestions. I tested your code with 100 draws. The gradient eval was cut by about half!

1 Like

I am nearing the end of this gp model and I wanted to test my stan logic because I am getting some results I just do not understand. I am trying to add additional covariates that are independent of the gp. However when i add to the model there is a syntax error depending on how I write it. Here is the fully updated code:

``````  matrix gp_exp_quad_chol(real[] x, real alpha, real len, real jitter) {
int dim_x = size(x);
matrix[dim_x, dim_x] L_K_x;
matrix[dim_x, dim_x] K_x = cov_exp_quad(x, alpha, len);
for (n in 1:dim_x)
K_x[n,n] = K_x[n,n] + jitter;
L_K_x = cholesky_decompose(K_x);
return L_K_x;
}
}
data {
// Model is for outcomes that have been
// observed on a 2 dimensional grid
int<lower=1> N; // Number of observations
int<lower=1> K; // Number of predictors
int<lower=1> dim_x_1; // size of grid dimension 1
int<lower=1> dim_x_2; // size of grid dimension 2
real x_1[dim_x_1]; // locations - 1st dimension
real x_2[dim_x_2]; // locations - 2nd dimension
int<lower=0> y[N];  // Outcome
matrix[N, K] X; // predictor matrix

int row_i[N]; // row position of each observation
int col_j[N]; // col position of each observation

}
parameters {
real w0; //intercept
vector[K] beta; //PH predictors
real<lower=0> len_scale_x_1; // length-scale - 1st dimension
real<lower=0> len_scale_x_2; // length-scale - 2nd dimension
real<lower=0> alpha; // Scale of outcomes
// Standardized latent GP
matrix[dim_x_1, dim_x_2] y_tilde;
}

transformed parameters {
matrix[dim_x_1, dim_x_2] latent_gp;
{
matrix[dim_x_1, dim_x_1] L_K_x_1 = gp_exp_quad_chol(x_1, 1.0, len_scale_x_1, 1e-12);
matrix[dim_x_2, dim_x_2] L_K_x_2 = gp_exp_quad_chol(x_2, alpha, len_scale_x_2, 1e-12);

// latent_gp is matrix-normal with among-column covariance K_x_1
// among-row covariance K_x_2

latent_gp = L_K_x_1 * y_tilde * L_K_x_2';
}
}
model {
vector[N] log_lambda;

// priors
len_scale_x_1 ~ gamma(8, 2);
len_scale_x_2 ~ gamma(8, 2);
alpha ~ normal(0, 1);
to_vector(y_tilde) ~ normal(0, 1);
//non informative prior for the intercept
w0 ~ normal(0,10);
//Weakly informative prior for the other variables in data
beta ~  student_t(7, 0, 2.5);

// likelihood

for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]]  +  w0  +  X[n] * beta;
y ~ poisson_log(log_lambda);
}

``````

However in log_lambda loop if I write the code this way I get an error:

``````for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]]  +  w0  + beta * X[n];
``````

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

base type mismatch in assignment; variable name = log_lambda, type = real; right-hand side type=matrix

ERROR at line 68
67:
68: for (n in 1:N) log_lambda[n] = latent_gp[row_i[n], col_j[n]] + w0 + beta * X[n];
^
69: y ~ poisson_log(log_lambda);

PARSER EXPECTED:
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
one node produced an error: failed to parse Stan model ‘lgcp_testben’ due to the above error.

Am I missing something? I don’t see how the first formulation would run while the other throws an error.

Thanks!

`X[n]` is a row vector, and `beta` is a column vector, so if you want the dot product of these two it’s `X[n] * beta`.

`beta * X[n]` is a KxK matrix.

Hope that helps!

Oops, simple linear algebra 101 mistake. Shouldn’t code on a empty stomach. Thanks ben!