Initial value rejected

Lilia_Feng · July 31, 2018, 6:44pm

Hello, I have this error message when I try sampling. I use RStan and the code is as follows. Sorry, I can’t share my real code, but the following is fake code that I created that is very similar.
Any suggestions are appreciated. Thanks.
…
Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

modelstring="
data{
int<lower=0>  T;
int<lower=0>  N;
vector<lower=0> [N] score;
int<lower=0>  A[N, T];
int<lower=0>  trial[N];
int<lower=0> y1[N,T];
int<lower=0> y2[N,T];
}

parameters{
matrix[1,2] beta;
vector<lower=0> [T] d;
matrix[2,8] ran_trial;
vector<lower=0> [2] s_trial;
}

model {
for (o in 1:2){
beta[1,o] ~ normal(0,100);
s_trial[o] ~ student_t(2,0,1)T[0,];
for(tn in 1:8){
ran_trial[o, tn] ~ normal(0,s_trial[o]); 
}}

for(j in 1:T) {
d[j] ~ gamma(0.0001, 0.001);
for (i in 1:N){
y1[i,j] ~ poisson(A[i,j]*exp(beta[1,1]* score[i] +ran_trial[1,trial[i]])*d[j]);
y2[i,j] ~ poisson(A[i,j]*exp(beta[1,2]* score[i] +ran_trial[2,trial[i]])*d[j]);
}}
}
"
sm = stan_model(model_code=modelstring)
stan_data = list(score=mydata$score,trial=as.numeric(mydata$trial),
                 y1=y1,y2=y2,A=A, N=N, T=T)
fit= sampling(sm, data=stan_data, chains=3,iter=600,warmup=200,thin=2)

[edit: auto-indented code]

tiagocc · July 31, 2018, 8:14pm

I don’t have enough technical knowledge to understand your model, but I would first do what the error message suggests and set the init_r parameter lower than 2 and see what you get.

 fit = sampling(
sm, 
data=stan_data, 
chains=3,
iter=600,
warmup=200,
thin=2, 
init_r = .1
)

On the other hand, I would look for parameters that might need lower or upper bounds and declare those bounds in the parameters block.

Finally, in my modeling experience, I sometimes have models where a parameter value cannot exceed values of the data. In those cases, I set the initial values to something very small to avoid any issues. That might not be the case in your model, but there might be parameters that, if initialized at a very extreme value, make the whole sum inside the Poisson distribution negative, which might be the source of your problem.

hhau · August 1, 2018, 1:09am

You need a <lower = 0> bound on d, and you also need some way of ensuring that A[i,j] +beta[1,1]* score[i] +ran_trial[1:2,trial[i]]+d is greater than zero. Perhaps a log link function?

Lilia_Feng · August 1, 2018, 6:36pm

Thanks for your suggestion. I just realized I used the wrong model, and I just fixed my post. The updated model is as follows. I did put the <lower=0> on d. It is weird to me because A is non-negative, exp() won’t give me a negative number, and d is also positive. So I don’t see how the value inside the Poisson could be negative.

y1[i,j] ~ poisson(A[i,j]*exp(beta[1,1]* score[i] +ran_trial[1,trial[i]])*d[j]);
y2[i,j] ~ poisson(A[i,j]*exp(beta[1,2]* score[i] +ran_trial[2,trial[i]])*d[j]);

Bob_Carpenter · August 1, 2018, 7:50pm

The Stan code is still not grammatical. I auto-indented and you can see the model block is closed before the end of the statements.

In an unrelated comment, we strongly recommend against those gamma priors; see the Stan wiki page on recommended priors (easy to find with search).

To ensure d stays positive, it must be declared with <lower = 0>. If it’s not, initialization will be between (-2, 2) and sampling can step below zero. With a big vector, each value has only a 50% chance of being drawn positive, so it is very unlikely that all of them are, and that will prevent initialization.
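
To put a rough number on that last point: if d were left unconstrained, each of its T elements would start from a uniform draw on (-2, 2), so all of them would be positive with probability 0.5^T. A small R sketch, where T_len = 10 is a made-up length and not the poster's data:

```r
# Hypothetical length of d; the real T is not given in the thread.
T_len <- 10

# Each unconstrained element is drawn uniformly on (-2, 2), so it is
# positive with probability 0.5, independently of the others.
p_one_attempt <- 0.5^T_len

# Stan retries up to 100 times (per the error message above).
p_within_100 <- 1 - (1 - p_one_attempt)^100

print(c(one_attempt = p_one_attempt, within_100 = p_within_100))
```

With <lower = 0> declared, this failure mode goes away, because Stan then draws the initial value on the unconstrained (log) scale and every draw maps to a positive d.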

Lilia_Feng · August 1, 2018, 8:43pm

Thanks, Bob. I did declare d to be positive. As for the gamma priors, I have to stick with them, because that’s what I was asked to use.
I tried to give initial values for beta as follows, but I still get this initial value rejected error.

init_beta= matrix(NA,1,2)
for (i in 1:2) {
init_beta[1,i]= rnorm(1,0,1)
}

fit= sampling(sm, data=stan_data, chains=3,iter=60,warmup=20,thin=2,init=init_beta)

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

Lilia_Feng · August 2, 2018, 4:31pm

I tried to assign initial values. In my model the parameter that I care about is beta, and ran_trial is a random effect, so I guess I don’t have to give initial values to it, correct? But I still get the same initial value rejected error.

modelstring="
data{
int<lower=0>  T;
int<lower=0>  N;
vector<lower=0> [N] score;
int<lower=0>  A[N, T];
int<lower=0>  trial[N];
int<lower=0> y1[N,T];
int<lower=0> y2[N,T];
}

parameters{
matrix[1,2] beta;
vector<lower=0> [T] d;
matrix[2,8] ran_trial;
vector<lower=0> [2] s_trial;
}

model {
for (o in 1:2){
beta[1,o] ~ normal(0,100);
s_trial[o] ~ student_t(2,0,1)T[0,];
for(tn in 1:8){
ran_trial[o, tn] ~ normal(0,s_trial[o]); 
}}

for(j in 1:T) {
d[j] ~ gamma(0.0001, 0.001);
for (i in 1:N){
y1[i,j] ~ poisson(A[i,j]*exp(beta[1,1]* score[i] +ran_trial[1,trial[i]])*d[j]);
y2[i,j] ~ poisson(A[i,j]*exp(beta[1,2]* score[i] +ran_trial[2,trial[i]])*d[j]);
}}
}
"
sm = stan_model(model_code=modelstring)
stan_data = list(score=mydata$score,trial=as.numeric(mydata$trial),
                 y1=y1,y2=y2,A=A, N=N, T=T)
init_beta= matrix(NA,1,4)
for (i in 1:2) {
init_beta[1,i]=rnorm(1,0,2)
}
fit= sampling(sm, data=stan_data, chains=3,iter=600,warmup=200,thin=2,init=init_beta)

jjramsey · August 2, 2018, 5:24pm

Not necessarily. Try setting initial values for ran_trial and see what happens.

Lilia_Feng · August 2, 2018, 5:34pm

I tried the following code to give initials to beta. I didn’t find a similar example online. Do you mind explaining how to tell Stan to set initials to multiple parameters with some of them as matrix/vectors? Thanks.

init_beta= matrix(NA,1,4)
for (i in 1:2) {
init_beta[1,i]=rnorm(1,0,2)
}
fit= sampling(sm, data=stan_data, chains=3,iter=600,warmup=200,thin=2,init=init_beta)

jjramsey · August 2, 2018, 6:16pm

And that code is completely wrong. The init argument should not be set to a matrix.

I strongly suggest that you read the documentation for RStan, especially the part on “inits via function” (which is actually somewhat easier than using “inits via list”). Look at the examples of initialization in the “Examples” section of the documentation of the stan function.

Lilia_Feng · August 2, 2018, 8:05pm

Thanks. I just revised the code based on the Stan manual, and set the initial values of beta and ran_trial either to 0s or to draws from normal distributions. All gave me the same error messages. OK, I am stuck. :(

n_chains <- 3
initf <- function(chain_id = 1) {
  list(ran_trial=array(rep(0,16), dim = c(2,8)),beta = array(rep(0,2), dim = c(1,2)))
}
init_ll <- lapply(1:n_chains, function(id) initf(chain_id = id))
fit= sampling(sm, data=stan_data, chains=3,iter=60,warmup=20,thin=2,init=init_beta)

jjramsey · August 2, 2018, 8:16pm

You don’t need chain_id = 1 in your definition of initf.
You don’t need init_ll at all.
In the arguments to the sampling function, init_beta should be replaced with initf.
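
Put together, those three points might look like the sketch below. It also fills in d and s_trial, which goes beyond the advice above; T_len is a hypothetical stand-in for the T passed in stan_data, and the sampling call is left commented out because it needs the compiled model sm.

```r
# Hypothetical stand-in for stan_data$T.
T_len <- 5

# One list entry per parameter, with the same dimensions as the
# declarations in the parameters block.
initf <- function() {
  list(
    beta      = array(0, dim = c(1, 2)),  # matrix[1,2] beta
    ran_trial = array(0, dim = c(2, 8)),  # matrix[2,8] ran_trial
    d         = array(1, dim = T_len),    # vector<lower=0>[T] d
    s_trial   = array(1, dim = 2)         # vector<lower=0>[2] s_trial
  )
}

# Inspect the structure that would be handed to each chain.
str(initf())

# Assumed usage, passing the function itself rather than a matrix:
# fit <- sampling(sm, data = stan_data, chains = 3, init = initf)
```

Parameters left out of the list are still initialized randomly, so listing all of them makes the starting point fully reproducible.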

Lilia_Feng · August 2, 2018, 8:31pm

Oh, right, sorry, I pasted the old code. I used init=initf instead of init_beta.
Then I revised based on your suggestions 1 and 2 as follows. Still the same error. Is something wrong with my model, or my data? I am confused. Maybe I should try a different dataset to see how it goes.

n_chains <- 3
initf <- function() {
  list(ran_trial=array(rep(0,16), dim = c(2,8)),beta = array(rep(0,2), dim = c(1,2)))
}
fit= sampling(sm, data=stan_data, chains=3,iter=60,warmup=20,thin=2,init=initf)

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

sakrejda · August 2, 2018, 9:01pm

So while chasing good initial values can be fun, I find it gets old fast. Here’s an alternative:

Stan does random inits on a radius around zero.
Your model should be able to start with all parameters set to zero.
It’s really convenient if the model can also start within a radius (+/-2) around zero.

Often that means re-writing your model and maybe adding some offsets or transformations to your code. If you look at the diagnostic output, especially the momenta (p_* in the CmdStan diagnostic file) and gradients (g_* in the CmdStan diagnostic file) they will point you to the specific individual parameters that have non-finite gradients. That gets you a procedure for fixing your model: check how those specific parameters with non-finite gradients are being used in the density calculations and modify it.

You can get these diagnostics without even running your model by trying the “diagnose” option (rather than “sampling”)… I don’t really remember how those are made available in rstan…
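
In RStan, one possible route (a sketch, not necessarily the one meant above) is to evaluate the log density and its gradient directly with log_prob, grad_log_prob and get_num_upars, which take a stanfit object and a point on the unconstrained scale. The helper name check_init_gradient is made up for illustration, and creating the stanfit object with chains = 0 is an assumption about how to get one without sampling.

```r
# Hypothetical helper: evaluate the gradient at a point on the
# unconstrained scale (default: all zeros) and report which entries
# are not finite.
check_init_gradient <- function(fit0, upars = NULL) {
  n_upars <- rstan::get_num_upars(fit0)
  if (is.null(upars)) upars <- rep(0, n_upars)
  g <- rstan::grad_log_prob(fit0, upars)
  list(
    log_prob   = attr(g, "log_prob"),  # log density at upars
    non_finite = which(!is.finite(g))  # positions in the unconstrained vector
  )
}

# Assumed usage; needs rstan, the compiled model sm and stan_data:
# fit0 <- sampling(sm, data = stan_data, chains = 0)
# check_init_gradient(fit0)
```

The reported positions index the flattened unconstrained parameter vector, so they have to be mapped back to beta, d, ran_trial and s_trial by declaration order.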

Bob_Carpenter · August 2, 2018, 10:58pm

You have to define your Stan program so that any values of the parameters meeting the declared constraints have a finite log density.

See the manual for whatever interface you’re using.

But usually you shouldn’t need to initialize if the constraints are defined properly.

Maybe you can point whoever is asking you to use those gamma priors to Andrew’s paper explaining why they are much stronger priors than people think. Usually people use them thinking they’re very non-informative, which is not the case.
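
A quick way to see how much this particular prior says (an illustration, not taken from the paper mentioned above) is to ask base R how much mass gamma(0.0001, 0.001) puts on very small values of d. Stan's gamma is parameterized by shape and rate, which matches the shape and rate arguments of pgamma; the cutoff 1e-10 is an arbitrary choice.

```r
shape <- 0.0001
rate  <- 0.001

# Prior probability that d falls below 1e-10 (an arbitrary, very small cutoff).
p_tiny <- pgamma(1e-10, shape = shape, rate = rate)

# Prior mean of d is shape / rate.
prior_mean <- shape / rate

print(c(p_below_1e_10 = p_tiny, prior_mean = prior_mean))
```

Nearly all of the prior mass sits essentially at zero even though the prior mean is 0.1, so this prior is a long way from flat.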

Lilia_Feng · August 6, 2018, 5:16pm

Thanks for your reply.
I tried to set all parameters zero as initials, but still got this error:

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.
Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can’t start sampling from this initial value.

Actually the same model is written in JAGS, and it runs with no problems. I am wondering if it is because of the way Stan samples. The model compiles, but sampling fails with my data. The matrices y1, y2 and A contain only 0s and 1s. I am wondering if those 0s give Stan a hard time. Should I try replacing those 0s with a very small number?
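
One thing that can be checked without Stan: when A[i,j] is 0 the Poisson rate in the model is exactly 0, and a rate of 0 gives probability zero to any positive count, which is one way to end up with log(0). The sketch below uses small made-up matrices in place of the real A and y1, and find_impossible_cells is a hypothetical helper name.

```r
# Hypothetical helper: cells where the rate is forced to 0 (A == 0)
# but a positive count was observed.
find_impossible_cells <- function(A, y) {
  which(A == 0 & y > 0, arr.ind = TRUE)
}

# Made-up 2 x 3 example data, not the poster's.
A  <- matrix(c(1, 0, 1, 0, 1, 1), nrow = 2)
y1 <- matrix(c(0, 1, 1, 0, 0, 1), nrow = 2)

# Row and column indices of any offending cells.
print(find_impossible_cells(A, y1))

# The same fact in base R: log density of a positive count under rate 0.
print(dpois(1, lambda = 0, log = TRUE))
```

If the helper returns no rows for both y1 and y2, this particular cause can be ruled out; it says nothing about whether the gradient is finite.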

sakrejda · August 6, 2018, 6:01pm

I was not suggesting changing the initial values. You need to write your model such that you get a finite density with the (internal) initial values set to zero. So any unconstrained parameters will be zero at initialization and any constrained parameters with <lower=0> will be 1 at initialization.
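
The reason is the transform: Stan works with a <lower=0> parameter on the log scale and maps it back with exp(). A small base R illustration of what that means for the default initial values:

```r
# Unconstrained value 0 maps to the constrained value exp(0).
init_at_zero <- exp(0)

# Random inits are uniform on (-2, 2) on the unconstrained scale, so a
# <lower=0> parameter such as d or s_trial starts between exp(-2) and exp(2).
init_range <- exp(c(-2, 2))

print(init_at_zero)
print(round(init_range, 3))
```

So a standard deviation like s_trial never starts at exactly zero under the default scheme; it can only get there if an init list sets it explicitly.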

jjramsey · August 6, 2018, 7:04pm

One of your parameters, s_trial, is a standard deviation and really should not be set to zero.

Lilia_Feng · August 6, 2018, 7:45pm

I added the transformed parameters block. No, still same error. :(

modelstring="
data{
int<lower=0>  T;
int<lower=0>  N;
vector<lower=0> [N] score;
int<lower=0>  A[N, T];
int<lower=0>  trial[N];
int<lower=0> y1[N,T];
int<lower=0> y2[N,T];
}

parameters{
matrix[1,2] beta;
vector<lower=0> [T] d;
matrix[2,8] ran_trial;
vector<lower=0> [2] s_trial;
}
transformed parameters{
matrix[N, T] p1;
matrix[N, T] p2;
for(j in 1:T) {
for (i in 1:N){
p1[i,j]=A[i,j]*exp(beta[1,1]* score[i] +ran_trial[1,trial[i]])*d[j];
p2[i,j]=A[i,j]*exp(beta[1,2]* score[i] +ran_trial[2,trial[i]])*d[j];
}}
}

model {
for (o in 1:2){
beta[1,o] ~ normal(0,100);
s_trial[o] ~ student_t(2,0,1)T[0,];
for(tn in 1:8){
ran_trial[o, tn] ~ normal(0,s_trial[o]); 
}}

for(j in 1:T) {
d[j] ~ gamma(0.0001, 0.001);
for (i in 1:N){

if (A[i,j]==1){

y1[i,j] ~ poisson(p1[i,j]);
y2[i,j] ~ poisson(p2[i,j]);

}

}}
}
"

Murali_037 · March 10, 2020, 3:14pm

Can you please tell me which paper of Andrew’s are you referring here?
