Wired baseline hazard estimation


The stan code I used is a translated version from OPENBUGS code from Leuk: Cox regression: http://www.openbugs.net/Examples/Leuk.html

My problem here is that with a good sample size (62 events samples, 2 independent variables), I got very different cumhaz from cox model. The coefficients are quite close (20% difference), but the cumhaz has 10 times of difference. I wonder what I did wrong or how can I improve my model / prior, etc to solve this situation?

Thank you very much for your advice.

Check this out.

Moreover, OpenBugs and Stan for the same model (likelihood and prior) and same data can yield different results (posteriors)…

1 Like

Hi Ermeel, thank you for your input. Do you think this difference is normal just due to randomness? I am comparing stan cumhaz results to Coxph’s.

The question is what do you actually compare? If I am not mistaken, when you refer to cumHaz in the survival package, this uses actually non-parametric estimate (similar to Kaplan Meier), which in particular “pools” all individuals together and does not take into account covariates. This is not the same as the baseline hazard that you have in the Stan model (which is related to the survival function of an oftentimes hypothetical individual that has all covariates at 0). The best quantity to compare to cumHaz or Kaplan Meier is something like the standardised survival curve. See this recent case study together with @sambrilleman (more precisely check out section 3.5.2. for a definition and section 6.3. for a concrete example).

Speaking about the case study, maybe you worth to try out the the convenient stan_surv function referred to therein and the corresponding built in function to do the comparison to Kaplan Meier ps_check.

Thank you so much for the detailed reply!