Understanding the cause of divergent transitions (no apparent funnel behavior)

I’m working on a model and keep getting the infamous warning about divergent transitions after warmup. When I’ve encountered this before, it has been due to some funnelish behavior, typically such that some variance parameter moves between small and large values and cause the marginal distribution of some other parameters to fluctuate between narrow and wide. I’ve then been able to transform away the issue. In this case, however, I don’t see any such behavior. In fact, the flagged transitions (red in the attached figure) seem to be distributed in proportion to the density. Does anybody have any ideas of how to identify which parameter(s) are/is causing the problem based on the behavior? The plot has all the parameters of the performed analysis. I expect the banana correlation between the first two parameters would cause poor mixing, but does it also cause divergent transitions? It’s not a trivial task to transform the parameters and I’d be happy for any pointers about which one(s) to focus on rather than explore through trial and error.

The attached figure used adapt_delta=0.99, but I’ve also tried with adapt_delta=0.999 and still get warnings about divergent transitions, if however fewer.

PS. I’ve not included the code because the full code has many additional parameters that has been turned off in the example posted here. Happy to post it if someone’s interested and/or think it helps to understand the behavior.

Sorry, short on time, but I think what you are seeing is the model being only weakly identified, see my blog on the topic for some more thoughts.

Yeah same. I’ve been looking at this awhile and just haven’t responded. This is a good question but hard to respond to.

This is the thread to read: Getting the location + gradients of divergences (not of iteration starting points)

But the short story is sometimes it’s just hard to get anything out of these plots.

We’re in guess and check mode. Do you mind posting the model? Numeric things can lead to divergences. Not doing things on the log scale, for instance (having any explicit exps and such).

2 Likes

Thanks for good input. A problem with weak identifiability makes sense. The model has power exponents and coefficients for the included factors, which would counteract each other.

Here’s the code. Only height exponents and coefficients were included in the model behind the above figure.

data {
  int<lower=1> N;
  real<lower=0> logAge[N];
  real<lower=0> Diam[N];
  real<lower=0> Height[N];
  real<lower=0> Bonitet[N];
  real<lower=0> SoilDepth[N];
  real<lower=0> Prunes[N];
  int<lower=1> SoilType[N];
  int<lower=1> NSoilTypes;
  
  int<lower=0, upper=1> includeDiam;
  int<lower=0, upper=1> includeHeight;
  int<lower=0, upper=1> includeBonitet;
  int<lower=0, upper=1> includeSoilDepth;
  int<lower=0, upper=1> includePrunes;
  
  int<lower=0, upper=1> includeSoilType;
  
  int<lower=0, upper=1> includeDiamExp;
  int<lower=0, upper=1> includeHeightExp;
  int<lower=0, upper=1> includeBonitetExp;
  int<lower=0, upper=1> includeSoilDepthExp;
  int<lower=0, upper=1> includePrunesExp;
  
  
  
  real<lower=0> SIGMADiam;
  real<lower=0> SIGMAHeight;
  real<lower=0> SIGMABonitet;
  real<lower=0> SIGMASoilDepth;
  real<lower=0> SIGMAPrunes;
  
  real<lower=0> SIGMAlogsigma0;
  real<lower=0> SIGMAsigmacoef;
  
  real<lower=0> DiamAlpha;
  real<lower=0> HeightAlpha;
  real<lower=0> BonitetAlpha;
  real<lower=0> SoilDepthAlpha;
  real<lower=0> PrunesAlpha;
  
  int<lower=0, upper=1> doexactloo;
  real logAgePred[doexactloo ? 1 : 0];
  real DiamPred[doexactloo ? 1 : 0];
  real HeightPred[doexactloo ? 1 : 0];
  real BonitetPred[doexactloo ? 1 : 0];
  real SoilDepthPred[doexactloo ? 1 : 0];
  real PrunesPred[doexactloo ? 1 : 0];
  int<lower=1> SoilTypePred[doexactloo ? 1 : 0];
  
}


parameters {
  real DiamCoef[includeDiam ? 1 : 0];
  real HeightCoef[includeHeight ? 1 : 0];
  real BonitetCoef[includeBonitet ? 1 : 0];
  real SoilDepthCoef[includeSoilDepth ? 1 : 0];
  real PrunesCoef[includePrunes ? 1 : 0];
  real SoilTypeCoef_raw[includeSoilType ? NSoilTypes : 0];
  
  real<lower=0> DiamExp[includeDiamExp ? 1 : 0];
  real<lower=0> HeightExp[includeHeightExp ? 1 : 0];
  real<lower=0> BonitetExp[includeBonitetExp ? 1 : 0];
  real<lower=0> SoilDepthExp[includeSoilDepthExp ? 1 : 0];
  real<lower=0> PrunesExp[includePrunesExp ? 1 : 0];
  
  real logsigma0;
  real sigmacoef;
  
  real intercept;
  real<lower=0> SIGMASoilType[includeSoilType ? 1 : 0];
}

transformed parameters {
  real mu[N];
  real<lower=0> sigma[N];
  real SoilTypeCoef[includeSoilType ? NSoilTypes : 0];
  real DiamContrib[N];
  real HeightContrib[N];
  real BonitetContrib[N];
  real SoilDepthContrib[N];
  real PrunesContrib[N];
  real SoilTypeContrib[N];
  real mupred[doexactloo ? 1 : 0];
  real sigmapred[doexactloo ? 1 : 0];
  real DiamContribPred[doexactloo ? 1 : 0];
  real HeightContribPred[doexactloo ? 1 : 0];
  real BonitetContribPred[doexactloo ? 1 : 0];
  real SoilDepthContribPred[doexactloo ? 1 : 0];
  real PrunesContribPred[doexactloo ? 1 : 0];
  real SoilTypeContribPred[doexactloo ? 1 : 0];
  
  if (includeDiam){
    
    
    for (i in 1:N){
      if (includeDiamExp){
        DiamContrib[i]=DiamCoef[1]*Diam[i]^DiamExp[1]; 
        if (doexactloo){
          
          DiamContribPred[1] = DiamCoef[1]*DiamPred[1]^DiamExp[1];
        }
        
      }else{
        
        DiamContrib[i]=DiamCoef[1]*Diam[i]; 
        if (doexactloo){
          
          DiamContribPred[1] = DiamCoef[1]*DiamPred[1];
        }
      }
      
    }
    
  }else{
    for (i in 1:N){
      
      DiamContrib[i]=0; 
      if (doexactloo){
        
        DiamContribPred[1] = 0;
      }
    }
    
  }
  
  if (includeHeight){
    
    
    
    for (i in 1:N){
      if (includeHeightExp){
        HeightContrib[i]=HeightCoef[1]*Height[i]^HeightExp[1]; 
        if (doexactloo){
          
          HeightContribPred[1] = HeightCoef[1]*HeightPred[1]^HeightExp[1];
        }
        
      }else{
        
        HeightContrib[i]=HeightCoef[1]*Height[i]; 
        if (doexactloo){
          
          HeightContribPred[1] = HeightCoef[1]*HeightPred[1];
        }
      }
      
    }
    
  }else{
    for (i in 1:N){
      
      HeightContrib[i]=0; 
      if (doexactloo){
        
        HeightContribPred[1] = 0;
      }
    }
    
  }
  
  if (includeBonitet){
    
    
    
    for (i in 1:N){
      if (includeBonitetExp){
        BonitetContrib[i]=BonitetCoef[1]*Bonitet[i]^BonitetExp[1]; 
        if (doexactloo){
          
          BonitetContribPred[1] = BonitetCoef[1]*BonitetPred[1]^BonitetExp[1];
        }
        
      }else{
        
        BonitetContrib[i]=BonitetCoef[1]*Bonitet[i]; 
        if (doexactloo){
          
          BonitetContribPred[1] = BonitetCoef[1]*BonitetPred[1];
        }
      }
      
    }
    
  }else{
    for (i in 1:N){
      
      BonitetContrib[i]=0; 
      if (doexactloo){
        
        BonitetContribPred[1] = 0;
      }
    }
    
  }
  
  if (includeSoilDepth){
    
    
    
    for (i in 1:N){
      if (includeSoilDepthExp){
        SoilDepthContrib[i]=SoilDepthCoef[1]*SoilDepth[i]^SoilDepthExp[1]; 
        if (doexactloo){
          
          SoilDepthContribPred[1] = SoilDepthCoef[1]*SoilDepthPred[1]^SoilDepthExp[1];
        }
        
      }else{
        
        SoilDepthContrib[i]=SoilDepthCoef[1]*SoilDepth[i]; 
        if (doexactloo){
          
          SoilDepthContribPred[1] = SoilDepthCoef[1]*SoilDepthPred[1];
        }
      }
      
    }
    
  }else{
    for (i in 1:N){
      
      SoilDepthContrib[i]=0; 
      if (doexactloo){
        
        SoilDepthContribPred[1] = 0;
      }
    }
    
  }
  
  if (includePrunes){
    
    
    
    for (i in 1:N){
      if (includePrunesExp){
        PrunesContrib[i]=PrunesCoef[1]*Prunes[i]^PrunesExp[1]; 
        if (doexactloo){
          
          PrunesContribPred[1] = PrunesCoef[1]*PrunesPred[1]^PrunesExp[1];
        }
        
      }else{
        
        PrunesContrib[i]=PrunesCoef[1]*Prunes[i]; 
        if (doexactloo){
          
          PrunesContribPred[1] = PrunesCoef[1]*PrunesPred[1];
        }
      }
      
    }
    
  }else{
    for (i in 1:N){
      
      PrunesContrib[i]=0; 
      if (doexactloo){
          
          PrunesContribPred[1] = 0;
        }
    }
    
  }
  
  if (includeSoilType){
    
    for (t in 1: NSoilTypes){
      SoilTypeCoef[t]=SoilTypeCoef_raw[t] * SIGMASoilType[1];// reparametrisation of chi for improved computation
      
    }
    
    for (i in 1:N){
      
      
      SoilTypeContrib[i]= SoilTypeCoef[SoilType[i]];
      if (doexactloo){
          
          SoilTypeContribPred[1] = SoilTypeCoef[SoilTypePred[1]];
        }
      
    }
  }else{
    for (i in 1:N){
      
      SoilTypeContrib [i]= 0;
      
      if (doexactloo){
          
          SoilTypeContribPred[1] = 0;
        }
      
    }
    
  }
  
  for (i in 1:N){
    mu[i]=intercept + DiamContrib[i] + HeightContrib[i] + BonitetContrib[i] + SoilDepthContrib[i] + PrunesContrib[i] + SoilTypeContrib[i];
    sigma[i]=exp(logsigma0+sigmacoef*mu[i]);
  }
  
  if (doexactloo){
    mupred[1] =intercept + DiamContribPred[1] + HeightContribPred[1] + BonitetContribPred[1] + SoilDepthContribPred[1] + PrunesContribPred[1] + SoilTypeContribPred[1];
    sigmapred[1]=exp(logsigma0+sigmacoef*mupred[1]);
    
  }
  
}
model {
  for (i in 1:N){
    
    logAge[i] ~ normal(mu[i], sigma[i]);
    
  }
  
  if (includeDiam){
    
    DiamCoef[1] ~ normal(0,SIGMADiam);
    if (includeDiamExp){
      
      DiamExp[1] ~ lognormal(0,DiamAlpha);
      
      
    }
    
  }
  
  if (includeHeight){
    
    HeightCoef[1] ~ normal(0,SIGMAHeight);
    
    if (includeHeightExp){
      
      HeightExp[1] ~ lognormal(0,DiamAlpha);
      
      
    }
  }
  
  if (includeBonitet){
    
    BonitetCoef[1] ~ normal(0,SIGMABonitet);
    if (includeBonitetExp){
      
      BonitetExp[1] ~ lognormal(0,DiamAlpha);
      
      
    }
    
  }
  
  if (includeSoilDepth){
    
    SoilDepthCoef[1] ~ normal(0,SIGMASoilDepth);
    
    if (includeSoilDepthExp){
      
      SoilDepthExp[1] ~ lognormal(0,DiamAlpha);
      
      
    }
  }
  
  if (includePrunes){
    
    PrunesCoef[1] ~ normal(0,SIGMAPrunes);
    
    if (includePrunesExp){
      
      PrunesExp[1] ~ lognormal(0,DiamAlpha);
      
      
    }
  }
  
  if (includeSoilType){
    for (t in 1:NSoilTypes){
      SoilTypeCoef_raw[t] ~ normal( 0 , 1 ); //  reparametrisation for improved computation
      //SoilTypeCoef[t] ~ normal(0,SIGMASoilType[1]);
      
    }
    
    SIGMASoilType[1] ~ cauchy(0,1);
    
    
  }
  
  
  logsigma0 ~ normal(0,SIGMAlogsigma0);
  sigmacoef ~ normal(0,SIGMAsigmacoef);
}

generated quantities {
  vector[doexactloo ? 1 : N] log_lik;
  if (doexactloo){
    log_lik[1] = normal_lpdf(logAgePred[1]|mupred[1] , sigmapred[1]);
    
  }else{
    for (i in 1:N){
      
      log_lik[i] = normal_lpdf(logAge[i]|mu[i] , sigma[i]);
      
    }
    
    
  }
  
  
  
}

This code is discussed also in another thread, and I think it would be better to use splines for non-linearities and use e.g. brms for making the model.