Hello,
In my model, I have a term of the following form: \frac{1-e^x}{1-k e^x}, where k>0 and x can range from -inf to inf.
In my case, such a term appears in several closed form expressions describing the simple birth and death process (a.k.a. Kendall process, a Markov birth death process). For example, the probability of the process going extinct by time t is P_{extinct}[t;\lambda,\mu]=\frac{\mu}{\lambda} \left( \frac{e^{(\lambda-\mu)t}-1}{e^{(\lambda-\mu)t}-\frac{\mu}{\lambda}} \right).
I am running into issues with numerical precision when the x in \frac{1-e^x}{1-k e^x} approaches -inf. (In my case, x=(\lambda-\mu)t and t is my data in minutes. Some values of t reach ~80 minutes, which makes x take on values like -60.) The term approaches 1, but I cannot have the term equal 1 exactly.
In one case I’ve debugged, I can’t have the term equal to 1 because I am using it as a probability p_extinct
in a simplex: simplex theta = [prob_extinct, prob_event2, prob_event3]
. The components of theta
must sum up to 1. When I calculate prob_extinct
=1 and prob_event2
= 1e-80, for example, then attempt prob_event3 = 1 - prob_extinct - prob_event2
, then I get prob_event3
< 0, which leads to a NaN if I work with log probabilities. Alternatively, if I compute prob_event3
separately, Stan throws an error because the sum of the terms of theta
is >1. My Stan model fails to initialize when run because the initial 100 parameter guesses (e.g. for \lambda and \mu) always lead to theta
containing NaN’s.
Computing \log(\frac{1-e^x}{1-k e^x}) as log1m_exp(x) - log1m_exp( x + log(k) )
helps with precision of the numerator and denominator separately, but doesn’t get around the fact that I’m subtracting two similar numbers from each other. Is there another clever way to compute \frac{1-e^x}{1-k e^x} using the composed functions available in Stan?
Or is the problem elsewhere, perhaps in that I am not using simplex variables in their intended usage? Should I have an if statement like if(p_extinct == 1){ return(1 - machineeps()) }
to make sure I never return exactly 1 for p_extinct
?
In my model, the simplex is used like this: mycountdata ~ multinomial( theta )
.