Error in initialization due to nonfinite gradient rejection caused by -Inf values in transformed parameters

I have a model where I construct a partially fixed matrix in the transformed parameters block, so that the diagonal and parts of the upper triangle are estimated while the remaining elements are set to -Inf. However, this seems to result in nonfinite gradient errors, even though the likelihood (and the priors) are always finite. Here is a toy example that reproduces the same error:

data {
  int<lower=1> N;
  real y[N];
}

parameters {
  real mu;
}

transformed parameters {
  vector[2] x;
  x[1] = mu;
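  // fixed -Inf entry: this constant alone triggers the nonfinite gradient rejection at initialization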
  x[2] = negative_infinity();
}
model {
  mu ~ normal(0, 1);
  y ~ normal(x[1], 1);
}

This is of course a very artificial example where the issue could easily be circumvented, but in my practical application this “x” variable is an input to a custom likelihood function, where it is much easier to do the computations with the full matrix containing -Inf values than to write custom elementwise computations (the locations of the -Inf values are fixed case by case). Edit: Maybe I should be more direct about the application I have in mind: essentially I am building a constrained transition matrix for a Markov model where some of the transition probabilities are fixed to zero (which leads to -Inf on the log scale).
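
Something along these lines, just to make it concrete (the 3-state layout and the variable names here are only an illustration, not my actual model): state 1 cannot move directly to state 3, so that entry of the log-scale transition matrix is fixed to -Inf while everything else is estimated.

parameters {
  simplex[2] p1_free;  // state 1 can only stay in 1 or move to 2
  simplex[3] p2;       // full transition row for state 2
  simplex[3] p3;       // full transition row for state 3
}
transformed parameters {
  matrix[3, 3] log_P;  // log-scale transition matrix
  log_P[1, 1] = log(p1_free[1]);
  log_P[1, 2] = log(p1_free[2]);
  log_P[1, 3] = negative_infinity();  // transition 1 -> 3 has probability zero
  log_P[2] = log(p2)';
  log_P[3] = log(p3)';
}
model {
  // the actual likelihood using log_P would go here
}

The same rejection shows up as soon as log_P contains that fixed -Inf entry, just like in the toy example above.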

It seems that this issue was also raised on GitHub years ago (Get "infinite gradient" error for code with finite gradient · Issue #275 · stan-dev/rstan · GitHub). Any ideas on how to deal with this?

I figured out an ugly hack to get rid of the issue in my case:

      // tmp below is constructed as theta + x, where x may contain
      // fixed -Inf elements as well as unknown parameters
      int n_finite = 0;
      // reorder elements of tmp so that all finite values are at the start
      for (i in 1:M) {
        if (tmp[i] > negative_infinity()) {
          n_finite += 1;
          tmp[n_finite] = tmp[i];
        }
      }
      if (n_finite == 0) {
        ll = negative_infinity(); // log_sum_exp of only -Inf values is -Inf
      } else {
        ll = log_sum_exp(tmp[1:n_finite]);
      }
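
In case it helps anyone else, the same idea can be packaged as a user-defined function in the functions block; the name log_sum_exp_finite is just something I made up, not a built-in:

functions {
  // log_sum_exp over the finite elements of x, skipping fixed -Inf entries;
  // returns -Inf if every element is -Inf
  real log_sum_exp_finite(vector x) {
    int M = num_elements(x);
    vector[M] tmp = x;
    int n_finite = 0;
    for (i in 1:M) {
      if (tmp[i] > negative_infinity()) {
        n_finite += 1;
        tmp[n_finite] = tmp[i];
      }
    }
    if (n_finite == 0) {
      return negative_infinity();
    }
    return log_sum_exp(tmp[1:n_finite]);
  }
}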

I think you answered your own question, so what I’m writing here is mostly redundant; it just validates what you already said.

Your toy example shows that even when there is nothing wrong with the calculations that are actually needed, an error will still be raised, because HMC is a black box that requires gradients with no infinities regardless, so the only real option may be to exclude those values.
Excluding them directly is probably not possible in your matrix formulation, so a hacky workaround like the one you posted is likely the easiest way to go. Loops are still pretty fast in Stan, so even if the formulation doesn’t look elegant, it should work fine as long as the model isn’t too large.
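
One small refinement of your hack: since the positions of the -Inf values are fixed case by case, you could pass the indices of the finite entries as data and index with them directly, skipping the per-iteration finiteness check. A sketch of a helper along those lines (the function name is made up and the array keyword assumes a reasonably recent Stan version; nothing here is specific to your model):

functions {
  // log_sum_exp over only the entries of x listed in idx
  // (the positions that are not fixed to -Inf);
  // returns -Inf when idx is empty
  real log_sum_exp_idx(vector x, array[] int idx) {
    if (size(idx) == 0) {
      return negative_infinity();
    }
    return log_sum_exp(x[idx]);
  }
}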