How to debug "Gradient evaluated at the initial value is not finite."

tek · June 13, 2025, 12:52pm

Hi,

I’m attempting to adapt a colleagues JAGS model to Stan that estimates an infection time series across multiple sites. The Stan version has made it to the stage where it compiles, but fails with an infinite gradient error:

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can’t start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts.

What would be the best way to move forward with debugging the model? Is there a way to figure out which call is causing the infinite gradient? I tried using the test_gradient option and the diagnose method from cmdstanr, but they only give the setup options, not any of the intermediate output.

Here is the code. I apologize that it is on the longer side, I don’t know how to meaningfully simplify it. I could cut it down to a single site if that would help.

data {
  int<lower=1> J;
  int<lower=1> I;
  array[I,J] int<lower=0> attendees;
  array[I,J] int<lower=0> fever;
  array[I,J] int<lower=0> tested;
  array[I,J] int<lower=0> Pf_confirmed;
  array[J] int<lower=1> N2;
}



parameters {
  real alpha1_Pf;
  real alpha2_Pf;
  real <lower=.7,upper=.9>f_Pf;
  real <lower=0,upper=1>s;
  array[J] real <lower=0,upper=1> r;
  array[J] real beta0_test;
  array[J] real beta0_att;
  array[I,J] real <lower=0> Endemic_Pf;
  
  
  

  array[J] real<lower=0> alpha0_Pf;
  array[J] real<lower=0,upper=10> RC_Pf1; 
  array[J]  real<lower=0,upper=1> lambda_E_Pf1;

}

transformed parameters {
  array[J] real P_ATT;
  array[J] real P_TEST;
  array[I,J] real<lower=0> pconfirm_Pf;
  array[I,J] real <lower=0> lambda_attending;
  array[I,J] real <lower=0> lambda_fever;
  array[I,J] real RC_Pf;
  
  array[I,J] real <lower=0> lambda_E_Pf;
  
  for(j in 1:J){
    P_ATT[j]=inv_logit(beta0_att[j]);
    P_TEST[j]=inv_logit(beta0_test[j]);
    RC_Pf[1,j]= RC_Pf1[j];
    lambda_E_Pf[1,j]=lambda_E_Pf1[j];
  }
  
  
  for(i in 1:I){
    for(j in 1:J){
      RC_Pf[i,j] = Endemic_Pf[i,j];
      lambda_attending[i,j] = (r[j]*(N2[j]-RC_Pf[i,j]) + f_Pf*RC_Pf[i,j]) * (P_ATT[j]);
      lambda_fever[i,j] = (s*r[j]*(N2[j]-RC_Pf[i,j]) + f_Pf*RC_Pf[i,j] );
      pconfirm_Pf[i,j] = (P_TEST[j]*f_Pf*RC_Pf[i,j]);
      
    }
  }
  
  for(i in 2:I){
    for(j in 1:J){
      lambda_E_Pf[i,j] = exp(-alpha0_Pf[j] + alpha1_Pf*RC_Pf[i,j] + alpha2_Pf*RC_Pf[(i-1),j]);
    }
  }
  
}


model {

  alpha1_Pf ~ normal(0, 31.2);
  alpha2_Pf ~ normal(0, 31.2);
  
  f_Pf ~ uniform(0.7, 0.9); 
  s~ beta(1, 10);
  
  for ( j in 1:J) {
   
    r[j] ~ beta(1,10);
   
    
    beta0_test[j] ~normal(0, 10);
    beta0_att[j] ~normal(0, 10);
    
    alpha0_Pf[j] ~ gamma(1,1);
    
    // initialize values 
    Endemic_Pf[1,j]~uniform(0,10);
    Endemic_Pf[2,j]~uniform(0,10);
   
    
    
    for ( i in 1:I) {
      
   // OBSERVATION PROCESS
     
      
      
      attendees[i,j] ~ poisson(lambda_attending[i,j]);
      fever[i,j]~ poisson(lambda_fever[i,j]);                    
      tested[i,j]~ poisson(P_TEST[j]*fever[i,j]);
      Pf_confirmed[i,j]  ~ poisson(pconfirm_Pf[i,j]);
    }
    
    // MALARIA PROCESS - Pf
    for ( i in 2:(I-1)) {
      Endemic_Pf[(i+1),j] ~ normal(N2[j]*lambda_E_Pf[i,j],N2[j]*lambda_E_Pf[i,j]);
    }
  }
}

mhollanders · June 13, 2025, 9:52pm

I’m sure there’s a better way to do it but I’d start putting print statements for all of your immediate containers and see if anything is -Inf or nan. That’s helped me resolve this issue too many times to count.

tek · June 13, 2025, 10:30pm

Ah, that’s a good idea. I had print statements for the variables in the transformed parameters when I was debugging the initial version, but for some reason I hadn’t thought to do it on all the other parameters. Thanks!

edit:
Oh my gosh, I hadn’t entirely read the entire documentation on print statements and missed the super relevant section on debugging via print and target().

nhuurre · June 14, 2025, 5:22am

Unfortunately print() does not print gradients. When the target is finite but has nonfinite gradient, there’s not much you can do other than 1) set init=0 or some plausible values and 2) delete parts of the model until it initializes.

Looking at the code, maybe lambda_E_Pf becomes so extreme that the “malaria process” normal distribution at the end bugs out.

andrjohns · June 14, 2025, 6:41am

I think the issue might be due to Endemic_Pf being declared as unbounded positive:

array[I,J] real <lower=0> Endemic_Pf;

But the model likelihoods only have support from 0-10:

    Endemic_Pf[1,j]~uniform(0,10);
    Endemic_Pf[2,j]~uniform(0,10);

Which means that the parameter could be initialised with a value greater than the likelihood is defined for.

If you add upper=10 to the Endemic_Pf declaration, does that change anything?

tek · June 14, 2025, 11:36am

Thanks for all the suggestions.

adding an upper bound to Endemic_Pf didn’t fix things, nor did setting init to 0.

I’ll probably try cutting down the model to start from a simplified point that will hopefully run and build back from there. I could also take some of the parameter estimates from the jags model and use those for a more fleshed out set of initial conditions.

Topic		Replies	Views
Gradient evaluated at the initial value is not finite Modeling rstan , fitting-issues	1	571	January 1, 2022
Initial value rejected Modeling	21	8782	January 31, 2023
How to debug "Gradient evaluated at the initial value is not finite" error Modeling ode	2	870	June 5, 2021
Issue with model: Gradient evaluated at the initial value is not finite Modeling rstan , specification , multinomial-response , hierarchical-model , dirichlet-multinomial	9	487	September 14, 2023
Debug help Modeling	2	796	November 29, 2017

How to debug "Gradient evaluated at the initial value is not finite."

Related topics