# Rejecting initial value - Gradient evaluated at the initial value is not finite

I’m trying to fit a hierachical model described by this paper, A Bayesian spatio-temporal model of COVID-19 spread in England | Scientific Reports. Yet I can’t seem to figure out this message:
Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can’t start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

I’ve tried initialising everything to 0 at the start, and printing each of the parameters, but none of them are `nan`, so I’m not too sure what’s going on. Model code is below. Any help would be greatly appreciated.

``````data {
int<lower=0> s; // number of spatial locations
int<lower=0> t; // number of time points
int<lower=0> k; // number of covariates

vector<lower=0>[t*s] N; // population

int<lower=0> y[t*s]; // response data
matrix[s, k] X; // dependent data

cov_matrix[s] big_sigma_w;
}

parameters {
vector[k] beta; // covariates
vector[s] w[t]; // spatial random effect

vector[s] xi_start; // first value of xi

real<lower=0> sigma_w; // Spatial variance parameter
real<lower=-1, upper=1> alpha; // Temporal dependence parameter
}

transformed parameters {

// Transform
vector[s] N_vec[t];
for (i in 1:t) {
N_vec[i] = to_vector(N[((i - 1) * s + 1):(i * s)]);
}

vector[s] xi[t]; // spatio-temporal random effect
xi[1] = xi_start; // Assign first value

if (t > 1){
for (i in 2:t){
xi[i] = alpha * xi[i-1] + w[i];
}
}

vector[s] log_theta[t]; // infection risk
vector[s] mu[t]; // mean

for(i in 1:t){
log_theta[i] = X * beta + xi[i];
mu[i] = N_vec[i] .* exp(log_theta[i]);
}

vector[t*s] mu_flat;

for (i in 1:t) {
mu_flat[((i - 1) * s + 1):(i * s)] = mu[i];
}

}

model {

// Penalized complexity prior for sigma_w using Half-Normal
// This ensures p(sigma_w >1) = 0.01
sigma_w ~ normal(0, 0.43); // Adjust the standard deviation as needed

// // Apply Beta prior to bound_alpha, which is transformed to alpha
// // This ensures p(alpha > 0) = 0.9
// bound_alpha ~ beta(2, 20);

xi_start ~ normal(0, sigma_w^2 / (1 - alpha^2));

beta ~ normal(0, 1000);

w ~ multi_normal(rep_vector(0, s), sigma_w^2*big_sigma_w);

for (i in 1:t*s){
y[i] ~ poisson(mu_flat[i]);
}

for (i in 1:t*s){
print("mu_flat[", i, "] = ", mu_flat[i]);
print("y[", i, "] = ", y[i]);
}

// for(i in 1:k){
//   print("beta[", i, "]", beta[k]);
// }

// for( i in 1:t){
//   print("w[", i, "]", w[i]);
// }

// print("sigma_w = ", sigma_w);
// print("alpha = ", alpha);

}

generated quantities {
vector[t*s] fitted_values;

for (i in 1:t*s) {
fitted_values[i] = poisson_rng(mu_flat[i]);
}
}
``````

Trying with the full dataset gives this error:
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can’t start sampling from this initial value.

Thanks all

1 Like

The problem is reported with the gradient, which may be problematic even if the values all look good. Unfortunately, debugging gradients is not really possible in any direct way.

I don’t see anything obviously wrong with the model. So I think your best bet is to try to simplify the model to see which part introduces the issues (e.g. by replacing some parameters with constants/moving them to data)

Best of luck with your model!

1 Like