Initial value rejected

Hello, I get the error message below when I try to sample. I am using RStan. Sorry, I can't share my real code, but the fake code that follows is very similar to it.
Any suggestions are appreciated. Thanks.

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

modelstring="
data {
  int<lower=0> T;
  int<lower=0> N;
  vector<lower=0>[N] score;
  int<lower=0> A[N, T];
  int<lower=0> trial[N];
  int<lower=0> y1[N, T];
  int<lower=0> y2[N, T];
}

parameters {
  matrix[1, 2] beta;
  vector<lower=0>[T] d;
  matrix[2, 8] ran_trial;
  vector<lower=0>[2] s_trial;
}

model {
  for (o in 1:2) {
    beta[1, o] ~ normal(0, 100);
    s_trial[o] ~ student_t(2, 0, 1)T[0,];
    for (tn in 1:8) {
      ran_trial[o, tn] ~ normal(0, s_trial[o]);
    }
  }

  for (j in 1:T) {
    d[j] ~ gamma(0.0001, 0.001);
    for (i in 1:N) {
      y1[i, j] ~ poisson(A[i, j] * exp(beta[1, 1] * score[i] + ran_trial[1, trial[i]]) * d[j]);
      y2[i, j] ~ poisson(A[i, j] * exp(beta[1, 2] * score[i] + ran_trial[2, trial[i]]) * d[j]);
    }
  }
}
"
sm = stan_model(model_code = modelstring)
stan_data = list(score = mydata$score, trial = as.numeric(mydata$trial),
                 y1 = y1, y2 = y2, A = A, N = N, T = T)
fit = sampling(sm, data = stan_data, chains = 3, iter = 600, warmup = 200, thin = 2)

[edit: auto-indented code]

I don't have enough technical knowledge to understand your model, but I would first do what the error message suggests: set the init_r argument to something lower than 2 and see what you get.

fit = sampling(
  sm,
  data = stan_data,
  chains = 3,
  iter = 600,
  warmup = 200,
  thin = 2,
  init_r = .1
)

Beyond that, I would look for parameters that might need lower or upper bounds and declare those bounds in the parameters block.

Finally, in my modeling experience, I sometimes have models where a parameter value cannot exceed values in the data. In those cases, I set the initial values to something very small to avoid any issues. That might not be the case in your model, but there might be parameters that, if initialized at a very extreme value, make the whole sum inside the Poisson distribution negative, which might be the source of your problem.

You need a <lower = 0> bound on d, and you also need some way of ensuring that A[i,j] +beta[1,1]* score[i] +ran_trial[1:2,trial[i]]+d is greater than zero. Perhaps a log link function?

Thanks for your suggestion. I just realized I had posted the wrong model, and I have fixed my post. The updated model is as follows. I did put <lower=0> on d. The error seems weird to me: A is non-negative, exp() won't give a negative number, and d is also positive, so I don't see how the value inside the Poisson could be negative.

y1[i,j] ~ poisson(A[i,j]*exp(beta[1,1]* score[i] +ran_trial[1,trial[i]])*d);
y2[i,j] ~ poisson(A[i,j]*exp(beta[1,2]* score[i] +ran_trial[2,trial[i]])*d);
 

The Stan code is still not grammatical. I auto-indented and you can see the model block is closed before the end of the statements.

On an unrelated note, we strongly recommend against those gamma priors; see the Stan wiki page on recommended priors (easy to find with search).

To ensure d stays positive, it must be declared with <lower = 0>. If it’s not, then initialization will be between (-2, 2) and sampling can step below zero. If you have a big vector, it’s only 50% likely each value is drawn positive, so that’ll prevent initialization.
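
To put a number on the "50% per value" point, here is a minimal R sketch. It assumes each element is drawn independently and uniformly from (-2, 2) and that every element must land above zero for the initialization to be accepted; the function names are hypothetical and not part of RStan.

```r
# Probability that n independent draws from uniform(-2, 2) are all positive.
p_all_positive <- function(n) 0.5^n

# Probability that at least one of `attempts` random initializations succeeds.
p_init_succeeds <- function(n, attempts = 100) {
  1 - (1 - p_all_positive(n))^attempts
}

sizes <- c(1, 5, 10, 20)
data.frame(
  n_elements = sizes,
  one_attempt = p_all_positive(sizes),
  within_100_attempts = p_init_succeeds(sizes)
)
```

Under these assumptions, a vector of 20 elements would almost never initialize within the 100 attempts that Stan makes, which is why declaring the bound matters.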

Thanks, Bob. I did declare d to be positive. As for the gamma priors, I have to stick with them because that's what I was asked to use.
I tried to give initial values for beta as follows, but I still get the initial value rejected error.

init_beta= matrix(NA,1,2)
for (i in 1:2) {
init_beta[1,i]= rnorm(1,0,1)
}

fit= sampling(sm, data=stan_data, chains=3,iter=60,warmup=20,thin=2,init=init_beta)

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

I tried to assign initial values. In my model the parameter that I care about is beta, and ran_trial is a random effect, so I guess I don't have to give initial values to it, correct? But I still get the same initial value rejected error.

modelstring="
data {
  int<lower=0> T;
  int<lower=0> N;
  vector<lower=0>[N] score;
  int<lower=0> A[N, T];
  int<lower=0> trial[N];
  int<lower=0> y1[N, T];
  int<lower=0> y2[N, T];
}

parameters {
  matrix[1, 2] beta;
  vector<lower=0>[T] d;
  matrix[2, 8] ran_trial;
  vector<lower=0>[2] s_trial;
}

model {
  for (o in 1:2) {
    beta[1, o] ~ normal(0, 100);
    s_trial[o] ~ student_t(2, 0, 1)T[0,];
    for (tn in 1:8) {
      ran_trial[o, tn] ~ normal(0, s_trial[o]);
    }
  }

  for (j in 1:T) {
    d[j] ~ gamma(0.0001, 0.001);
    for (i in 1:N) {
      y1[i, j] ~ poisson(A[i, j] * exp(beta[1, 1] * score[i] + ran_trial[1, trial[i]]) * d[j]);
      y2[i, j] ~ poisson(A[i, j] * exp(beta[1, 2] * score[i] + ran_trial[2, trial[i]]) * d[j]);
    }
  }
}
"
sm = stan_model(model_code = modelstring)
stan_data = list(score = mydata$score, trial = as.numeric(mydata$trial),
                 y1 = y1, y2 = y2, A = A, N = N, T = T)
init_beta = matrix(NA, 1, 4)
for (i in 1:2) {
  init_beta[1, i] = rnorm(1, 0, 2)
}
fit = sampling(sm, data = stan_data, chains = 3, iter = 600, warmup = 200, thin = 2, init = init_beta)

Not necessarily. Try setting initial values for ran_trial and see what happens.

I tried the following code to give initial values to beta. I didn't find a similar example online. Do you mind explaining how to tell Stan to set initial values for multiple parameters when some of them are matrices or vectors? Thanks.

init_beta= matrix(NA,1,4)
for (i in 1:2) {
init_beta[1,i]=rnorm(1,0,2)
}
fit= sampling(sm, data=stan_data, chains=3,iter=600,warmup=200,thin=2,init=init_beta)

And that code is completely wrong. The init argument should not be set to a matrix.

I strongly suggest that you read the documentation for RStan, especially the part on “inits via function” (which is actually somewhat easier than using “inits via list”). Look at the examples of initialization in the “Examples” section of the documentation of the stan function.
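
As a rough illustration of "inits via function" for the model in this thread, the sketch below builds a named list with one entry per parameter, shaped to match the declarations in the parameters block. The names `init_fun` and `n_T`, and the particular starting values, are assumptions made for illustration; `n_T` stands in for the value of T in the data.

```r
# Hypothetical stand-in for T from the data list.
n_T <- 5

# Returns one named list per call; names and dimensions must match the
# parameters block of the Stan program.
init_fun <- function() {
  list(
    beta      = matrix(rnorm(2, 0, 0.1), nrow = 1, ncol = 2), # matrix[1,2] beta
    d         = runif(n_T, 0.5, 1.5),                         # vector<lower=0>[T] d
    ran_trial = matrix(0, nrow = 2, ncol = 8),                # matrix[2,8] ran_trial
    s_trial   = runif(2, 0.5, 1.5)                            # vector<lower=0>[2] s_trial
  )
}

# Usage, assuming `sm` and `stan_data` exist as in the thread:
# fit <- sampling(sm, data = stan_data, chains = 3, init = init_fun)
str(init_fun())
```

The function is passed itself (not its result, and not a matrix) as the init argument, so each chain gets its own call. Values for the constrained parameters d and s_trial are kept strictly positive so they satisfy the declared lower bounds.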

Thanks. I just revised the code based on the Stan manual and set the initial values of beta and ran_trial either to 0s or to draws from normal distributions. All gave me the same error messages. OK, I am stuck. :(

n_chains <- 3
initf <- function(chain_id = 1) {
  list(ran_trial=array(rep(0,16), dim = c(2,8)),beta = array(rep(0,2), dim = c(1,2)))
}
init_ll <- lapply(1:n_chains, function(id) initf(chain_id = id))
fit= sampling(sm, data=stan_data, chains=3,iter=60,warmup=20,thin=2,init=init_beta)

  1. You don’t need chain_id = 1 in your definition of initf.
  2. You don’t need init_ll at all.
  3. In the arguments to the sampling function, init_beta should be replaced with initf.

Oh, right, sorry, I pasted the old code; I actually used init=init_f instead of init_beta.
Then I revised the code based on your suggestions 1 and 2 as follows. Still the same error. Is something wrong with my model or my data? I am confused. Maybe I should try a different dataset to see how it goes.

n_chains <- 3
initf <- function() {
  list(ran_trial=array(rep(0,16), dim = c(2,8)),beta = array(rep(0,2), dim = c(1,2)))
}
fit= sampling(sm, data=stan_data, chains=3,iter=60,warmup=20,thin=2,init=init_f)

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

So while chasing good initial values can be fun, I find it gets old fast. Here’s an alternative:

  1. Stan does random inits on a radius around zero.
  2. Your model should be able to start with all parameters set to zero.
  3. It’s really convenient if the model can also start within a radius (+/-2) around zero.

Often that means re-writing your model and maybe adding some offsets or transformations to your code. If you look at the diagnostic output, especially the momenta (p_* in the CmdStan diagnostic file) and gradients (g_* in the CmdStan diagnostic file) they will point you to the specific individual parameters that have non-finite gradients. That gets you a procedure for fixing your model: check how those specific parameters with non-finite gradients are being used in the density calculations and modify it.

You can get these diagnostics without even running your model by trying the “diagnose” option (rather than “sampling”)… I don’t really remember how those are made available in rstan

You have to define your Stan program so that any parameter values meeting the declared constraints have a finite log density.

See the manual for whatever interface you’re using.

But usually you shouldn’t need to initialize if the constraints are defined properly.

Maybe you can point whoever is doing the asking to Andrew’s paper explaining why those gamma priors are much stronger than people think. Usually people use them thinking they’re very non-informative, which is not the case.
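
One quick way to see that such a prior is far from flat is to ask how much of its mass sits extremely close to zero. The sketch below does this in base R for the gamma(0.0001, 0.001) prior used in the thread (shape 0.0001, rate 0.001). It is only a numerical illustration, not a summary of the paper mentioned above, and the cutoff 1e-10 is an arbitrary choice.

```r
shape <- 0.0001
rate  <- 0.001

# Prior probability that d is smaller than 1e-10.
p_tiny <- pgamma(1e-10, shape = shape, rate = rate)

# Prior mean of d, for comparison.
prior_mean <- shape / rate

c(p_below_1e_minus_10 = p_tiny, prior_mean = prior_mean)
```

Nearly all of the prior mass lies below 1e-10 even though the prior mean is 0.1, so the prior strongly favors values of d that are essentially zero.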

Thanks for your reply.
I tried setting the initial values of all parameters to zero, but still got this error:

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.
Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can’t start sampling from this initial value.

Actually, the same model is written in JAGS, and it runs there with no problems. I wonder whether this is because of the way Stan samples. The model compiles, but I cannot sample with my data. y1, y2, and A in my data contain only 0s and 1s. Could those 0s be giving Stan a hard time? Should I try replacing them with a very small number?

I was not suggesting changing initial values. You need to write your model such that you get a finite density with the (internal) initial values set to zero. So any unconstrained parameters will be zero at initialization and any constrained parameters with <lower=0> will be 1 at initialization.

One of your parameters, s_trial, is a standard deviation and really should not be set to zero.
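
The check described above can be mimicked by hand in base R. This is a sketch under stated assumptions: the two-row toy data are hypothetical, and the arithmetic is R's, not Stan's internals. A <lower=0> parameter is stored internally as its log, so an internal value of 0 corresponds to exp(0) = 1.

```r
# Internal (unconstrained) value 0 maps to 1 for a <lower=0> parameter.
d_init <- exp(0)

# Unconstrained parameters start at 0.
beta_init <- 0
ran_trial_init <- 0

# Hypothetical toy data: one observation with A = 0, one with A = 1.
A <- c(0, 1)
y <- c(0, 1)
score <- c(0.3, 0.7)

# Poisson rate used in the model: A * exp(beta * score + ran_trial) * d.
poisson_rate <- A * exp(beta_init * score + ran_trial_init) * d_init

# Log density of each observation, and its derivative with respect to the rate.
log_lik <- dpois(y, poisson_rate, log = TRUE)
grad_wrt_rate <- y / poisson_rate - 1

data.frame(A, y, poisson_rate, log_lik, grad_wrt_rate)

# A positive count with a zero rate has log density -Inf, i.e. log(0).
dpois(1, 0, log = TRUE)
```

In this toy case the row with A = 0 has a rate of exactly zero. Its log density is finite when y is 0, but the derivative y / rate - 1 is 0 / 0, which is not finite, and any positive count at a zero rate gives log(0). Whether this is what triggers the messages in the thread is an assumption, but it is consistent with both of them.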

I added the transformed parameters block. No, still same error. :(

modelstring="
data {
  int<lower=0> T;
  int<lower=0> N;
  vector<lower=0>[N] score;
  int<lower=0> A[N, T];
  int<lower=0> trial[N];
  int<lower=0> y1[N, T];
  int<lower=0> y2[N, T];
}

parameters {
  matrix[1, 2] beta;
  vector<lower=0>[T] d;
  matrix[2, 8] ran_trial;
  vector<lower=0>[2] s_trial;
}
transformed parameters {
  for (j in 1:T) {
    for (i in 1:N) {
      matrix[N, T] p1;
      matrix[N, T] p2;
      p1[i, j] = A[i, j] * exp(beta[1, 1] * score[i] + ran_trial[1, trial[i]]) * d[j];
      p2[i, j] = A[i, j] * exp(beta[1, 2] * score[i] + ran_trial[2, trial[i]]) * d[j];
    }
  }
}

model {
  for (o in 1:2) {
    beta[1, o] ~ normal(0, 100);
    s_trial[o] ~ student_t(2, 0, 1)T[0,];
    for (tn in 1:8) {
      ran_trial[o, tn] ~ normal(0, s_trial[o]);
    }
  }

  for (j in 1:T) {
    d[j] ~ gamma(0.0001, 0.001);
    for (i in 1:N) {
      if (A[i, j] == 1) {
        y1[i, j] ~ poisson(p1[i, j]);
        y2[i, j] ~ poisson(p2[i, j]);
      }
    }
  }
}
"

Can you please tell me which paper of Andrew’s you are referring to here?