Weibull Survival model for strongly censored data - Stan does not recover sample distribution values


Dear all,

I am very interested in survival models and would like to build one in Stan from scratch to really understand each step. On the way from a simple model to a more complex one I may have found something that doesn’t seem right, or it is just my lack of understanding. One of the two possibilities is more likely. ;)

Problem: As I understand it, the basic Stan model with quasi-flat priors should be consistent with the MLE as calculated by, e.g., flexsurv. This is approximately the case for data with 50% censored data points, but for strongly censored data, recovering the parameter values of the sample distribution worked really badly.

Did I overlook something?


  • Fit and model parameters look healthy (Rhat, N_eff, mixing)
  • For lightly right-censored data with equal numbers of failures and suspensions, the Stan results are close to the ones from flexsurv
  • If the ratio of censored to uncensored data is about 5:1, the form-parameter estimate is still OK, but the scale gets a really wide standard deviation, spanning the whole way from almost zero to far above the expected value, while the flexsurv result is still relatively close to the expected values
  • It seems that the contribution from the weibull_lccdf part negatively influences the result from weibull_lpdf
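  model {
    // observed failure times contribute the log density
    target += weibull_lpdf(yobs | form, scale);
    // right-censored times contribute the log survival function
    target += weibull_lccdf(ycen | form, scale);
    // quasi-flat priors
    form ~ normal(0, 3000);
    scale ~ normal(0, 300000);
  }

# Standard method in R
library(magrittr)  # provides %>%
library(survival)  # provides Surv()
input %>% flexsurv::flexsurvreg(Surv(t, FAILURE) ~ 1, data = ., dist = "weibull")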
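
For readers less familiar with the Stan syntax: the two `target +=` statements add up the usual right-censored log-likelihood, i.e. the log density for each observed failure plus the log survival function for each censored time. Here is a minimal Python sketch of that sum, using only the standard library. The function names and the toy numbers are hypothetical and not part of the model above. It also includes the standard closed form for the scale that maximizes this likelihood at a fixed form parameter, which shows that the censored times enter the scale estimate only through the total in the numerator, while the divisor is the number of failures.

```python
import math

def weibull_censored_loglik(yobs, ycen, form, scale):
    """Right-censored Weibull log-likelihood (hypothetical helper).

    Mirrors weibull_lpdf(yobs | form, scale) + weibull_lccdf(ycen | form, scale).
    """
    ll = 0.0
    for t in yobs:  # failures: log density
        z = t / scale
        ll += math.log(form / scale) + (form - 1.0) * math.log(z) - z ** form
    for c in ycen:  # suspensions: log survival, log S(c) = -(c / scale)^form
        ll -= (c / scale) ** form
    return ll

def scale_mle_given_form(yobs, ycen, form):
    """Scale that maximizes the log-likelihood for a fixed form (shape)."""
    total = sum(t ** form for t in yobs) + sum(c ** form for c in ycen)
    return (total / len(yobs)) ** (1.0 / form)

# Toy data (made up): one failure at t = 1, one unit censored at t = 3.
yobs, ycen = [1.0], [3.0]
scale_hat = scale_mle_given_form(yobs, ycen, form=1.0)
print(scale_hat)
print(weibull_censored_loglik(yobs, ycen, form=1.0, scale=scale_hat))
```

With many suspensions and few failures, the scale estimate is driven by a large total divided by a small count, so it is plausible that its uncertainty grows quickly as the censoring ratio increases.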

I can provide a full example if needed, but maybe this is sufficient as a starting point. Besides, before I can post sample data I still have to solve an issue with generating dummy data that is heavily right-censored (with many more censored than failed data points). But I assume that is a question for a different forum.

Thanks for your support! :)



I wouldn’t bet on this completely; also, it is IMHO worth being pedantic about what “consistent” means. I think that in theory the posterior mode should be close to the MLE, but

a) the default Stan output does not report the posterior mode, only the posterior mean and median, and those will differ from the mode if the posterior is skewed or otherwise not very nice (which is IMHO highly likely with a large amount of censoring), and
b) I’ve never used flexsurv, but in some frequentist packages the default method employs some form of regularization/penalty/correction that makes the result different from the pure MLE (usually because the MLE might not be identified or is otherwise problematic). This would make your model and the one in the package differ slightly, which may explain some of the discrepancies.
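
Point a) can be illustrated with a deliberately simplified case. The sketch below is Python (standard library only) and makes assumptions that are not in the original model: the form parameter is fixed at 1 (an exponential model), the prior on the scale is flat, and the data summary (5 failures, 600 units of total time on test including the censored units) is made up. Under those assumptions the posterior for the scale is proportional to scale^(-d) * exp(-T / scale), whose mode equals the MLE T / d while its mean is T / (d - 2). A summary based on the posterior mean then sits well above the MLE even though nothing is wrong with the sampler.

```python
import math

d = 5        # number of observed failures (made-up value)
T = 600.0    # total time on test: failure times plus censored times (made-up value)

def log_post(scale):
    # Exponential likelihood with right censoring, times a flat prior on scale.
    return -d * math.log(scale) - T / scale

# Evaluate the unnormalized posterior on a fine grid of scale values.
step = 0.5
grid = [step * i for i in range(1, 80001)]   # 0.5, 1.0, ..., 40000.0
weights = [math.exp(log_post(s)) for s in grid]

norm = sum(weights)
post_mean = sum(s * w for s, w in zip(grid, weights)) / norm
post_mode = grid[max(range(len(grid)), key=weights.__getitem__)]

print(f"MLE            = {T / d:.1f}")
print(f"posterior mode = {post_mode:.1f}")
print(f"posterior mean = {post_mean:.1f}")
```

The two-parameter Weibull posterior will not have this exact form, but the same kind of right skew in the scale is what I would expect with heavy censoring.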

Finally, I would strongly recommend testing Stan models starting with data simulated exactly as your model assumes (draw hyperparameters from the priors, draw the actual parameters, then draw the data). Once your model works on that, it makes sense to check how robust it is to data simulated in a different way. This also means you already have prior predictive checks, which are useful for choosing good priors.
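
A minimal sketch of that workflow in Python (standard library only) might look as follows. It is not the setup from the question: the lognormal priors, the function names, and the choice of a single fixed censoring time are all assumptions made for illustration. The censoring time is placed at the Weibull quantile that leaves roughly one failure per five suspensions, which is one simple way to get heavily right-censored dummy data.

```python
import math
import random

def draw_parameters(rng):
    """Draw (form, scale) from illustrative lognormal priors (assumed, not the poster's)."""
    form = rng.lognormvariate(math.log(1.5), 0.3)
    scale = rng.lognormvariate(math.log(100.0), 0.5)
    return form, scale

def simulate_censored(form, scale, n, frac_failed, rng):
    """Simulate n Weibull lifetimes with one fixed censoring time.

    The censoring time is the frac_failed quantile of the Weibull, so about
    frac_failed of the units fail and the rest are suspended at that time.
    """
    t_cens = scale * (-math.log(1.0 - frac_failed)) ** (1.0 / form)
    yobs, ycen = [], []
    for _ in range(n):
        # random.weibullvariate takes (scale, shape) in that order
        t = rng.weibullvariate(scale, form)
        if t < t_cens:
            yobs.append(t)       # observed failure
        else:
            ycen.append(t_cens)  # right-censored at the cutoff
    return yobs, ycen, t_cens

rng = random.Random(2024)
form, scale = draw_parameters(rng)
yobs, ycen, t_cens = simulate_censored(form, scale, n=6000, frac_failed=1 / 6, rng=rng)
print(f"form={form:.2f} scale={scale:.1f} failures={len(yobs)} censored={len(ycen)}")
```

The resulting yobs and ycen can be passed to the Stan model as data, and the posterior can then be compared with the known form and scale that generated them.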