For a time-to-event (survival) analysis, I aim to calculate the probability of an event occurring within an interval, e.g. between time t and (t+1), where the time-to-event probability is drawn from a continuous distribution.
The time-to-event distribution could be Exponential, Gamma or Weibull. For this example I will use the simpler exponential distribution:
The probability of the event within the interval (conditional on the event not having previously occured) is given by \int_t^{t+1} 1-e^{- \lambda x} dx ; which can be written as (1-e^{- \lambda (t+1)}) - (1-e^{- \lambda t})
Or in stan syntax: exponential\_cdf(t+1~ |~ \lambda) - exponential\_cdf(t ~|~ \lambda)
The key part of model code on the log scale is therefore:
transformed parameters {
vector[N] lp; //log probability for an interval
for(i in 1:N)
lp[i] = log_diff_exp(exponential_lcdf(t[i]+1 | lambda),
exponential_lcdf(t[i] | lambda));
}
However, fitting this model results in an inability to initialise model parameters:
Chain 1 Rejecting initial value:
Chain 1 Gradient evaluated at the initial value is not finite.
Chain 1 Stan can’t start sampling from this initial value.
Chain 1 Rejecting initial value:
Chain 1 Log probability evaluates to log(0), i.e. negative infinity.
Chain 1 Stan can’t start sampling from this initial value.
Chain 1 Initialization between (-2, 2) failed after 100 attempts.
I have fitted many variations of this model and it seems that subtracting the cumulative probabilities is causing the difficulties. A simplified version of the full code is given below.
Can anyone explain this behaviour?
data {
int <lower=1> N; //number of age classes
array[N] int<lower=1> tested; //number tested per age class
array[N] int<lower=0> positive; //number positive per age class
array[N] int<lower=1> age_lower; //lower interval of age class
array[N] int<lower=age_lower> age_upper; //upper interval of age class
}
parameters {
real<lower=0> lambda; //rate of exponential distribution
}
transformed parameters {
vector[N] lp; //log probability within an interval
for(i in 1:N)
lp[i] = log_diff_exp(exponential_lcdf(age_upper[i] | lambda),
exponential_lcdf(age_lower[i] | lambda));
}
model {
for(i in 1:N)
target += binomial_lpmf(positive[i] | tested[i], exp(lp[i]));
}