# Log probability evaluates to log(0), i.e. negative infinity

I am trying to write a Bayesian age-period-cohort (APC) model in Stan, with second-order random walk priors on the age, cohort, and period effects. However, every time I run the model, this error appears:
'Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can't start sampling from this initial value.'
I never call a log function in my code, so I cannot figure out where the bug is. Please help me debug this; I have been struggling with this program for quite a few days. Thank you!

``````
apc_model = "
data {
int<lower=1> I; // Number of observations
int<lower=0> cases [I] ; // Cases
vector[I] pyr; // Person-years at risk
vector age; // Age group indicators
vector coh; // Cohort group indicators
vector per; // Period group indicators
}
parameters {
vector<lower=-1000,upper=1000> alpha; // Age effects
vector<lower=-1000,upper=1000> gamma; // Cohort effects
vector<lower=-1000,upper=1000> beta; // Period effects
real<lower=1> sigma_a; // Standard deviation for age effects
real<lower=1> sigma_c; // Standard deviation for cohort effects
real<lower=1> sigma_p; // Standard deviation for period effects
}
transformed parameters {
vector lambda1;
vector lambda2;
vector lambda3;
real tau_a = pow(sigma_a, -2);
real tau_c = pow(sigma_c, -2);
real tau_p = pow(sigma_p, -2);

lambda1 = alpha .* age;
lambda2 = coh .* gamma;
lambda3 = per .* beta;
}

model {
matrix[16, 9] mu;
for (a in 1:16) {
for (p in 1:9) {
mu[a, p] = 100000 * exp(alpha[a] + beta[p] + gamma[p - a + 16]);
// Likelihood
}
}
cases ~ poisson(to_row_vector(mu));

// Age effects
alpha ~ normal(0, sqrt(1e-6 * tau_a));
alpha ~ normal(0, sqrt(1e-6 * tau_a));
for (a in 3:16) {
alpha[a] ~ normal(2 * alpha[a - 1] - alpha[a - 2], sqrt(tau_a));
}

// Cohort effects
gamma ~ normal(0, sqrt(1e-6 * tau_c));
gamma ~ normal(0, sqrt(1e-6 * tau_c));
for (c in 3:23) {
gamma[c] ~ normal(2 * gamma[c - 1] - gamma[c - 2], sqrt(tau_c));
}

// Period effects
beta ~ normal(0, sqrt(1e-6 * tau_p));
beta ~ normal(0, sqrt(1e-6 * tau_p));
for (p in 3:9) {
beta[p] ~ normal(2 * beta[p - 1] - beta[p - 2], sqrt(tau_p));
}

// Priors for standard deviations
sigma_a ~ uniform(1, 1000);
sigma_c ~ uniform(1, 1000);
sigma_p ~ uniform(1, 1000);
}"

``````

The error message:

``````
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
 "Error : Initialization failed."
 "error occurred during calling the sampler; sampling not done"

``````

Even though you never compute a log explicitly, Stan works on the log scale: it computes the total log probability as a sum of log densities, because the alternative, multiplying the raw probabilities together, is much more prone to numerical instability. What is likely happening is that the randomly chosen initial parameter values have such a low probability that it underflows to zero, and the log of zero is negative infinity. The chain cannot go anywhere from there; it needs a finite log probability, however small the probability itself. The easiest way to test whether that is really what is happening is to supply fixed initial values (and later relax them to make sure the chains still converge to the correct region). You can also fit the model to simulated data to test this, or to check for bugs in the implementation.
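To make the underflow concrete, here is a small numeric sketch in plain Python (not Stan). The first part shows a product of tiny probabilities collapsing to zero while the sum of their logs stays finite. The second part is an illustration under stated assumptions: that initial values are drawn uniformly from (-2, 2) on the unconstrained scale, as the error message suggests, and that a parameter declared with lower and upper bounds is mapped back through a scaled inverse logit. The helper names (`constrain`, `safe_exp`, `poisson_log_pmf`), the count `k = 5`, and the worst case where all three effects sit at the same extreme are hypothetical choices made for the example, not something taken from the model's actual run.

```python
import math

# Part 1: multiplying small probabilities underflows; summing logs does not.
probs = [1e-200, 1e-200, 1e-200]
product = math.prod(probs)                 # true value 1e-600 is below float64 range
log_sum = sum(math.log(p) for p in probs)  # stays finite

# Part 2: hypothetical helpers mimicking an initial value for a bounded parameter.
def inv_logit(u):
    return 1.0 / (1.0 + math.exp(-u))

def constrain(u, lower, upper):
    """Map an unconstrained value u to the interval (lower, upper)."""
    return lower + (upper - lower) * inv_logit(u)

def safe_exp(x):
    """exp(x) that returns inf instead of raising on overflow."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf

def poisson_log_pmf(k, rate):
    """Log of the Poisson pmf, with the degenerate rates handled explicitly."""
    if rate == 0.0:
        return 0.0 if k == 0 else -math.inf
    if math.isinf(rate):
        return -math.inf  # the pmf tends to 0 as the rate grows without bound
    return k * math.log(rate) - rate - math.lgamma(k + 1)

# Extremes of an unconstrained draw in (-2, 2), mapped to bounds (-1000, 1000).
alpha_low = constrain(-2.0, -1000.0, 1000.0)
alpha_high = constrain(2.0, -1000.0, 1000.0)

# Assumed worst case: three effects at the same extreme, rate = 100000 * exp(sum).
rate_low = 100000 * safe_exp(3 * alpha_low)    # exp underflows to 0.0
rate_high = 100000 * safe_exp(3 * alpha_high)  # exp overflows to inf
rate_mid = 100000 * safe_exp(0.0)              # all effects fixed at 0

k = 5  # made-up observed count
print("product of probabilities:", product)
print("sum of log probabilities:", log_sum)
print("alpha range at init:", alpha_low, alpha_high)
print("log pmf at low extreme:", poisson_log_pmf(k, rate_low))
print("log pmf at high extreme:", poisson_log_pmf(k, rate_high))
print("log pmf with effects fixed at 0:", poisson_log_pmf(k, rate_mid))
```

In this sketch, fixing the effects at 0 gives a very negative but finite log probability, which is the kind of starting point a sampler can move away from, whereas both extremes give negative infinity. That is the reasoning behind trying fixed initial values first.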
