# Issue with Survival Models

I am running into an issue fitting a three parameter generalized gamma survival model.

I am using the parameterization from the flexsurv package and have checked that survival and density functions are correct. It seems that the model runs into convergence issues when evaluating the survival function because of the integral relating to the gamma cumulative density function? For other survival models (i.e. Weibull) which have survival functions which do not require a cumulative density function do not have such an issue.

I know that there is a way around this as per the survHE package, however, I am wondering if there is a way to still specify the model likelihood using the density and survival functions?

functions {
real gen_gamma_lden(real x, real mu, real sigma, real Q) {
// Uses the same parameterisation as flexsurv
// mu = location
// sigma = scale
// Q = shape
real lden;
real w;
// Constructs the log-density for each observation
w = ((log(x)-mu))/sigma;

lden = -log(sigma*x)+log(fabs(Q))+pow(Q,-2)*log(pow(Q,-2))+pow(Q,-2)*(Q*w-exp(Q*w))-lgamma(pow(Q,-2));

return lden;

}
\\Survival function
real Surv_gengamma(real time, real mu, real sigma, real Q){
real Sind;
real qi;
real w;
real expnu;

qi = 1/(Q*Q);
w = (log(time)-mu)/sigma;
expnu = exp(Q*w)*qi;

Sind =  1- gamma_cdf(expnu, qi,1);
return Sind;

}

// Defines the sampling distribution
real gengamma_lpdf (real t, real d, real d2,  real mu, real sigma, real Q) {
real log_lik;
// Issue relates to the inclusion of the survival function;=
log_lik = d*gen_gamma_lden(t,mu,sigma, Q) + d2*log(Surv_gengamma(t,mu,sigma, Q));
//log_lik = d*gen_gamma_lden(t,mu,sigma, Q) ;
return log_lik;
}

}

data {
int<lower=1> n;                      // number of observations
vector[n] t;               // fully observed times
vector[n] d;              // censoring indicators
int<lower=1> H;                         // number of covariates (including intercept)
matrix[n,H] X;                  // matrix of categorical covariates for the valid cases (0/1 => dummy variables)
vector[H] mu_beta;                      // vector of means for the covariates
vector<lower=0>[H] sigma_beta;          // vector of sd for the covariates
real mu_Q;                              // mean for the parameter Q
real<lower=0> sigma_Q;                  // sd for the parameter Q
real<lower=0> a_sigma;                  // first parameter for the scale distribution
real<lower=0> b_sigma;                  // second parameter for the scale distribution

}

transformed data{
vector[n] d2;              // censoring indicators
for (i in 1:n){
if(d[i] == 1){
d2[i] = 0;
}else{
d2[i] =1;
}
}

}

parameters {
real Q;                                 // shape of the Generalised Gamma distribution
real<lower=0> sigma;                    // scale of the Generalised Gamma distribution
vector[H] beta;                         // coefficients for the covariates
}

transformed parameters{
vector[n] linpred;               // rescaled predictor (mu)

linpred = X*beta;

}

model {
// Prior distributions
Q ~ normal(mu_Q,sigma_Q);
sigma ~ gamma(a_sigma,b_sigma);
beta ~ normal(mu_beta,sigma_beta);

for(i in 1:n){
t[i] ~ gengamma(d[i],d2[i], linpred[i],sigma,Q);
}

}

generated quantities {
real mu;
mu = beta[1];
}


set.seed(123)

data.example = data.frame( times = flexsurv::rgengamma(100, 1.2,.9, .7),
status = 1)
#data.example$status[1:5] <-0 stan.data.example <- list() stan.data.example$t <- data.example$times stan.data.example$d <- data.example$status stan.data.example$n <- length(data.example$times) stan.data.example$H <- 2

stan.data.example$X <- matrix(c(rep(1,length(data.example$times)),
rep(0,length(data.example$times))), ncol = 2) stan.data.example$mu_beta <- c(0,0)
stan.data.example$sigma_beta <- c(100,100) stan.data.example$a_sigma <- 0.1
stan.data.example$b_sigma <- 0.1 stan.data.example$mu_Q<- 0
stan.data.example\$sigma_Q <- 100

fit <- rstan::stan(model_code = GenGamma.model, model_name = "GenGammaTest",
data = stan.data.example, iter = 2000, chains = 3)

flexsurvreg(Surv(times,status)~1,data =data.example, dist = "gengamma")


1 Like

Can’t really delve deeply into the problem, but it appears that there are two separate modes for Q, and chains get stuck in the individual modes:

In fact it appears, the model prohibits a range of Q values near zero:

(plots via ShinyStan)

Understanding why this is the case is probably important for moving forward.

Best of luck with your model!

@hhau seems to have encountered something similar before. Any suggestions?

Q is a shape parameter and must be positive, thus should have a <lower = 0> constraint. That would eliminate the symmetry / second mode. Edit: Perhaps not, I see now that flexsurv also permits negative values for Q. It also states that

a further class of models with negative q, and survivor function I(\gamma, u), where z is replaced by −z

but I don’t see this replacement in the code anywhere?

I am also suspicious of the numerical stability of this line:

lden = -log(sigma*x)+log(fabs(Q))+pow(Q,-2)*log(pow(Q,-2))+pow(Q,-2)*(Q*w-exp(Q*w))-lgamma(pow(Q,-2));


I would compute some of these terms separately and print out their value. Terms like log(pow(Q,-2)) will increase very quickly for small Q.

1 Like