# Null term for log_sum_exp()

Hi, this relates to this post, but it may be of more general interest. I want to understand how to program `log_sum_exp()` so that it does not increase for an element of the input vector when a certain condition is true, i.e.

```stan
for (n in 1:N) {
  vector[K] lps = a;
  for (k in 1:K) {
    if (condition)
      lps[k] += x;
    else
      lps[k] += normal_lpdf(y[n] | mu[k], sigma);
  }
  target += log_sum_exp(lps);
}
```


where `a` is such that `a[i] = 0` whenever the condition is true. My question is: what value of `x` should I use? I am asking because, from the documentation, I understand it should be x = -∞, but it does not work when I use `negative_infinity()`.
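In principle x = -∞ is indeed the right value: exp(-∞) is exactly 0 in floating point, so a component with `lps[k]` equal to negative infinity contributes nothing to the sum inside `log_sum_exp`. A quick sketch of the arithmetic (Python standing in for Stan, just to illustrate):

```python
import math

# exp(-inf) is exactly 0.0 in IEEE floating point, so a -inf element
# contributes nothing to the sum inside log_sum_exp.
neg_inf = float("-inf")
print(math.exp(neg_inf))  # 0.0

# log(exp(0.1) + exp(-inf) + exp(0.3)) equals log(exp(0.1) + exp(0.3))
with_inf = math.log(math.exp(0.1) + math.exp(neg_inf) + math.exp(0.3))
without = math.log(math.exp(0.1) + math.exp(0.3))
print(with_inf, without)
```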

Revisiting this, I think that the solution might simply be:

```stan
for (n in 1:N) {
  vector[K] lps;
  for (k in 1:K) {
    if (condition) {
      lps[k] = normal_lpdf(y[n] | mu[k], sigma);
    }
  }
  target += log_sum_exp(lps);
}
```


negative_infinity() seems to work for me (or at least agrees with other implementations like that in scipy):

```stan
transformed data {
  vector[3] x = [0.1, negative_infinity(), 0.3]';
  print(log_sum_exp(x));
}
```


prints 0.898139

```python
>>> from math import exp, log
>>> log(exp(0.1) + exp(0.3))
0.8981388693815919
```
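For completeness, here is a sketch of the numerically stable log-sum-exp (the max-shift trick that implementations like Stan's and scipy's use internally). Note the explicit guard for the all-(-∞) case, which a naive max shift would turn into NaN via (-∞) - (-∞):

```python
import math

def log_sum_exp(v):
    # Shift by the max so the largest exponent is exp(0) = 1,
    # which avoids overflow for large inputs.
    m = max(v)
    if m == float("-inf"):
        return float("-inf")  # every element is -inf, so the sum is log(0)
    return m + math.log(sum(math.exp(x - m) for x in v))

print(log_sum_exp([0.1, float("-inf"), 0.3]))  # the -inf element drops out
```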


What issue were you having when using it?

I tried this and it does not work; the error is:

```
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
```


Also, how do you define the vector lps for those k's for which the condition is FALSE?

I donâ€™t believe you need to define those lps[k]s at all, though youâ€™ll have to verify that. But the error youâ€™re receiving suggests that your entire log_sum_exp(lps) which should be mathematically equivalent to ln(e^lps[k1]+â€¦+e^lps[kK]) is summing to 0 inside of the ln. That means that your normal_lpdf(y[n] | mu[k], sigma) is negative infinity for all values of y[n] and mu[k]. There must be some other issue with your code that is driving this. Can you post your entire code as youâ€™re currently running it and also the version of Stan and the interface youâ€™re using?

If you donâ€™t assign a value then those lps[k]s are initialized to NaN (â€śNot A Numberâ€ť) value. Including any NaNs in the log_sum_exp poisons the whole sum so that the result is also NaN. I believe Stan reports log(NaN) log probability as if it were log(0); neither is a viable starting point.

Thatâ€™s good to know. However, it seems like they are getting log(0) when they assign it as negative_infinity(), so all of the lps[k]s must still be evaluating to negative_infinity(). This could perhaps point to their condition always feeding the if statement toward lps[k] = negative_infinity(); and never lps[k] = normal_lpdf(y[n] | mu[k], sigma); or it could indicate a problem with the y[n]s and/or the mu[k]s

I managed to solve the issue by replacing += with = for the negative_infinity():

```stan
for (n in 1:N) {
  vector[K] lps = a;
  for (k in 1:K) {
    if (condition)
      lps[k] = negative_infinity();
    else
      lps[k] += normal_lpdf(y[n] | mu[k], sigma);
  }
  target += log_sum_exp(lps);
}
```
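One plausible explanation (an assumption, since the full model wasn't posted) for why `=` works where `+=` failed: if any `a[k]` ends up as NaN or +∞, then `a[k] += negative_infinity()` yields NaN (NaN plus anything, or ∞ + (-∞), is NaN), while direct assignment overwrites the old value entirely. Illustrated in Python:

```python
import math

neg_inf = float("-inf")

# If the starting value is NaN, += keeps it NaN ...
lps_k = float("nan")
lps_k += neg_inf
print(math.isnan(lps_k))  # True

# ... and inf + (-inf) is also NaN.
print(math.isnan(float("inf") + neg_inf))  # True

# Direct assignment discards the old value entirely.
lps_k = neg_inf
print(lps_k)              # -inf
```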