Null term for log_sum_exp()

Hi, this is related to this post, but is of more general (or particular) interest. I wanted to understand how I can program the function log_sum_exp() so that a given element of the input vector does not contribute to the result when a certain condition is true, i.e.

  for (n in 1:N) {
    vector[K] lps = a;
    for (k in 1:K) {
      if (condition == TRUE)
        lps[k] += x;
      else
        lps[k] += normal_lpdf(y[n] | mu[k], sigma);
    }
    target += log_sum_exp(lps);
  }

where a is such that a_i = 0 whenever the condition is TRUE. My question is, what value of x should I use? I am asking this because, from the documentation, my understanding is that it should be x = -\infty, but it does not work when I use negative_infinity().
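
(For context, the reason x = -\infty should act as a null term is that exp(-\infty) = 0, so that element adds nothing to the sum inside the log: log(exp(lps[1]) + ... + exp(-\infty) + ... + exp(lps[K])) = log(exp(lps[1]) + ... + 0 + ... + exp(lps[K])).)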

Revisiting this, I think that the solution might simply be:

for (n in 1:N) {
  vector[K] lps;
  for (k in 1:K) {
    if (condition == TRUE) {
      lps[k] = normal_lpdf(y[n] | mu[k], sigma);
    }
  }
  target += log_sum_exp(lps);
}

negative_infinity() seems to work for me (or at least agrees with other implementations like that in scipy):

transformed data {
   vector[3] x = [0.1, negative_infinity(), 0.3]';
   print(log_sum_exp(x));
}

prints 0.898139

>>> from math import exp, log
>>> log(exp(0.1) + exp(0.3))
0.8981388693815919

What issue were you having when using it?

I tried this and it does not work; the error is:

Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.

Also, how do you define the vector lps for those k’s for which the condition is FALSE?

I don’t believe you need to define those lps[k]s at all, though you’ll have to verify that. But the error you’re receiving suggests that your entire log_sum_exp(lps), which should be mathematically equivalent to ln(e^lps[1] + … + e^lps[K]), is summing to 0 inside the ln. That means that your normal_lpdf(y[n] | mu[k], sigma) is negative infinity for all values of y[n] and mu[k]. There must be some other issue with your code that is driving this. Can you post the complete code as you’re currently running it, along with the version of Stan and the interface you’re using?
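
As a quick sanity check on that reading (a minimal sketch; I'm assuming Stan's log_sum_exp returns negative infinity when every element is negative infinity, which is what log(0) gives mathematically):

transformed data {
   // every term contributes exp(-inf) = 0, so the sum inside the log is 0
   vector[3] lps = rep_vector(negative_infinity(), 3);
   print(log_sum_exp(lps));  // should print -inf, i.e. log(0)
}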

If you don’t assign a value, those lps[k]s are initialized to NaN (“Not a Number”). Including any NaN in the log_sum_exp poisons the whole sum, so the result is also NaN. I believe Stan reports a log(NaN) log probability as if it were log(0); neither is a viable starting point.
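
Here is a minimal sketch of that behaviour (assuming the usual Stan semantics that declared-but-unassigned elements are NaN; the exact printed output may vary by interface):

transformed data {
   vector[3] x;            // unassigned elements start out as NaN
   x[1] = 0.1;
   x[3] = 0.3;             // x[2] is never assigned
   print(log_sum_exp(x));  // the NaN poisons the sum: prints nan instead of 0.898139
}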

That’s good to know. However, it seems like they are getting log(0) when they assign it as negative_infinity(), so all of the lps[k]s must still be evaluating to negative_infinity(). This could point to their condition always sending the if statement to lps[k] = negative_infinity(); and never to lps[k] = normal_lpdf(y[n] | mu[k], sigma);, or it could indicate a problem with the y[n]s and/or the mu[k]s.

I managed to solve the issue by replacing += with = in the negative_infinity() branch.

  for (n in 1:N) {
    vector[K] lps = a;
    for (k in 1:K) {
      if (condition == TRUE)
        lps[k] = negative_infinity();  // null term: exp(-inf) = 0, so it drops out of the sum
      else
        lps[k] += normal_lpdf(y[n] | mu[k], sigma);
    }
    target += log_sum_exp(lps);
  }