Survival analysis: initial values rejected

Hi, I am trying to do survival analysis, and saw this rstan code posted on Github, so I copied it into R, and used the provided data, but got the initial values rejection error. I have been working on it for 2 weeks, did a lot of googling in my efforts, tried to revise code many times, but still no luck.
Any suggestions are appreciated, thanks!

Here is the data https://github.com/fonnesbeck/stan_workshop_2016/blob/master/code/survival.data.r
the code: https://github.com/fonnesbeck/stan_workshop_2016/blob/master/code/survival.r

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

library(ggplot2)
library(plyr)
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

cox_model <- 
"data {
    int<lower=0> Nsubj; 
    int<lower=1> T;
    vector[Nsubj] pscenter;
    vector[Nsubj] hhcenter;
    int<lower=0,upper=1> ncomact[Nsubj];
    int<lower=0,upper=1> rleader[Nsubj];
    int<lower=0,upper=1> dleader[Nsubj];
    vector[Nsubj] inter1;
    vector[Nsubj] inter2;
    int<lower=0,upper=1> FAIL[Nsubj];
    int<lower=0> obs_t[Nsubj];
    int<lower=0> t[T+1];
    
} 
transformed data {
    int<lower=0> Y[Nsubj, T];
    int<lower=0> dN[Nsubj, T];
    
    # Set up data
    for(i in 1:Nsubj) {
        for(j in 1:T) {
            # risk set = 1 if obs_t >= t
            Y[i,j] <- int_step(obs_t[i] - t[j]);
            # counting process 
            dN[i,j] <- Y[i, j] * int_step(t[j + 1] - obs_t[i]) * FAIL[i];
        }
    }
}
parameters {
    vector[7] beta;
    real<lower=0> c;
    real<lower=0> r;
    vector[T] dL0;   
}
transformed parameters {
    vector[T] mu;
    matrix[Nsubj, T] Idt;
    vector[T] dL0_star;
    
    for(j in 1:T) {
        for(i in 1:Nsubj) {
            # Intensity
            Idt[i, j] <- Y[i, j] * exp(beta[1]*pscenter[i] + beta[2]*hhcenter[i] + beta[3]*ncomact[i] 
                        + beta[4]*rleader[i] + beta[5]*dleader[i] + beta[6]*inter1[i] + beta[7]*inter2[i]) * dL0[j];
        }     
        # prior mean hazard 
        mu[j] <- dL0_star[j] * c;   
        
        dL0_star[j] <- r * (t[j + 1] - t[j]);
    }
} 
model {
    
    for(j in 1:T) {
        for(i in 1:Nsubj) {
            # Likelihood
            dN[i, j] ~ poisson(Idt[i, j]);  
        }     
        dL0[j] ~ gamma(mu[j], c);
    }
    
    c ~ gamma(0.0001, 0.00001);
    r ~ gamma(0.001, 0.0001);
    
    beta ~ normal(0.0, 100000);
}"

data <- read_rdump('survival.data.r')

inits <- list(c1=list(beta=c(-.36,-.26,-.29,-.22,-.61,-9.73,-.23),  c=0.01, r=0.01, 
    dL0=c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)))


fit <- stan(model_code=cox_model, data=data, init=inits, iter=2000, chains=1, verbose=F)

print(fit)

…

Edit: Dr. Chris Foonesbeck replied me in the email, and he said he ran the code without any problem when he wrote it, so he suggested me to use an older version of Rstan. and that’s what I am doing now, I will report the result. I am curious why older version of Rstan works instead of the newest one.

This, at least, looks like a typo:

# prior mean hazard 
mu[j] <- dL0_star[j] * c;   
        
dL0_star[j] <- r * (t[j + 1] - t[j]);

dL0_star is being used before it is being defined. Maybe try swapping that around?

Also, “<-” is deprecated in Stan. Use “=” instead.

1 Like

unfortunately, It didnot help. still gave me same error.

Once the above mentioned stuff is fixed the problem lies with

for(i in 1:Nsubj) {
            // Likelihood
	  dN[i, j] ~ poisson(Idt[i, j]);  
        }   

Which says that the shape parameter is negative which I cannot understand since it comes from:

 for(j in 1:T) {
        for(i in 1:Nsubj) {
            // Intensity
            Idt[i, j] = Y[i, j] * exp(beta[1]*pscenter[i] + beta[2]*hhcenter[i] + beta[3]*ncomact[i] 
                        + beta[4]*rleader[i] + beta[5]*dleader[i] + beta[6]*inter1[i] + beta[7]*inter2[i]) * dL0[j];
        }

	dL0_star[j] = r * (t[j + 1] - t[j]);

        // prior mean hazard 
        mu[j] = dL0_star[j] * c;   
           }

Where Y is an int_step and thus either 1 or 0 and dL0 \sim Gamma() and thus also positive. The rest is in the exponent.

Just to check this I ran

for(i in 1:Nsubj) {
            // Likelihood
	  dN[i, j] ~ poisson(fabs(Idt[i, j]));  
        }   

and it is now sampling (although very slowly on my laptop, I’ll report results).

EDIT:
I only did 100 draws since it was so slow. Observations: the Rhat for Idt is Nan for a lot of values. I don’t know exactly how that is calculated but it doesn’t seem good.

Thanks for looking into it. is fabs() same as abs()? like get the absolute value of “Idt”? That is weird to me as well, since exp() won’t lead to a negative number. It did start sampling after I added fabs(), but will it affect the estimate of beta in someway? Thanks again!

Yes fabs is like abs but for vectors. I got a warning that abs is deprecated. I think we should investigate why Idt is negative. As you say that can technically not be the case. One possibility is that somewhere a log is taken which would make it appropriate to take the exponent (which also works instead of fabs).

Thanks for looking into this @danielw2904. Why does dN[i, j] ~ poisson(Idt[i, j]); imply Idt is negative? The rate parameter to a poisson has to be positive.

Maybe the intent of this was to be parameterized on a log scale? So it’d be dN[i, j] ~ poisson_log(Idt[i, j]); instead? That way these ldt things could be negative and that would make sense.

@Lilia_Feng, before we dig into this too much, have you tried e-mailing chris.fonnesbeck@vanderbilt.edu ? It seems like there might be a problem with his model.

Maybe the best way to find out is ask? You can link him to this Discourse thread to at least show we’ve tried a couple things haha.

No problem. Sorry for not being clear about this. As far as I understand the error message the problem is that the scale parameter is negative which raises an error. As far as I can tell though the parameter cannot be negative since non of its components can be negative (see 2nd code chunk in my reply above). My guess is now that the log of this parameter is somehow taken in the background s.t. values below 1 are negative. I have no idea how that would happen though.

Ah okay, yeah, the rate parameter can’t be negative in a poisson distribution.

Usually folks use link functions and stuff to avoid this issue, like if the rate parameter is on the log scale it makes sense for it to be negative (see poisson_log in the manual).

I’m just nervous changing things like this in the model without stopping to try to figure out what the model is actually doing haha.

Yeah we should definitely figure out the model before changing anything. I just wanted to investigate if that parameter is really the problem. I think emailing the author is the best course of action.

Just in case, you might want to try this parametric survival model I wrote; it’s a variation of the model proposed by Royston & Parmar (search for it here in the forum) that guarantees monotone cumulative baseline hazards and works well in practise (even for large scale survival analyses on >300k patients with around 50 covariates).

In particular, check this for a self contained example, utilising they QR decomposition for covariates.

2 Likes

should probably be

(Using fabs or abs to make something positive that should be positive by construction is probably generally a bad idea. But, of course, ok for diagnosing where the problem might be.)

Definitely should not just take the absolute value. However, just specifying lower=0 is not innocuous either. DL0 is drawn from a gamma so should be positive anyways unless it is really the log of a gamma in which case the exponent would be appropriate.

True, we cannot really know what the original author of the model wanted without asking him/her. One thing to note is that Stan doesn’t use the knowledge of what distributions are specified in the model block in initialization. So the random initialization will draw negative values for dL0 unless lower=0 is specified. (But actually the code in the first post does manual initialization, so I’m not sure if this was the problem after all. Yet the error message seems to imply random initialization?)

Thanks for your help. Yes, I did email him, but haven’t heard back from him yet.

The model is cox proportional hazard model, and “Idt” follows Poisson distribution.

Thanks, but I did try to declare lower=0 for dL0, but it did’t help fix the problem, still rejecting initial values.

Thanks for looking into it, @danielw2904. That will be weird if the log is taken in the background. If it is, should I use “possion_log(Idt) " instead? It does not really makes sense to me. I am new to Stan, and still learning it. but I always read something like " Stan see the log probability”. Is that how Stan works?

I also found this package for survival models, although for python (link is to stan files):

I’d be interested in porting this to an R package if that doesn’t exist yet. Let me know if you are interested and I’ll set a package up in the next couple of weeks.

Edit:
here an existing R package

Alright, well to answer specifically about initialization, it looks like the script is trying to initialize the sampler in a certain place. That’s what these inits are:

inits <- list(c1=list(beta=c(-.36,-.26,-.29,-.22,-.61,-9.73,-.23),  c=0.01, r=0.01, 
   dL0=c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)))

From what @danielw2904 has pointed out so far, it seems like Idt is ending up negative somehow.

The path to debugging this is figuring out which elements of Idt are negative.

The easiest way to do that would be to add print statements:

for(i in 1:Nsubj) {
  # Intensity
  Idt[i, j] <- Y[i, j] * exp(beta[1]*pscenter[i] + beta[2]*hhcenter[i] + beta[3]*ncomact[i] 
                        + beta[4]*rleader[i] + beta[5]*dleader[i] + beta[6]*inter1[i] + beta[7]*inter2[i]) * dL0[j];
  if(ldt[i, j] < 0.0) {
    print("Hey! Something went wrong at indices [", i, ", ", j, "]");
  }
}

And work back from there. Hope that helps!

I’m afraid I just don’t know what that means haha. Search around the forums for other survival models might give you ideas. @ermeel had a big thread on them, but there are probably simpler ones as well.

The fact that this model requires custom inits and is breaking so easy makes me doubt it’s robustness. You’ll probably want to make improvements on this one either way.