As @jonah suggested, let’s use this post to discuss the best way to collaborate on this and whether it should be part of an existing package or a new one.

As a start we could also collect models to be implemented:

An excellent approach would be to allow a prior on the shape of the hazard function, modeling the hazard as a cubic or linear spline, and to make the covariate space parametric (allowing regression splines).

Addendum: a powerful and useful thing to implement would be a model in which priors favor proportional hazards and linearity of covariate effects but also allow covariate × time interaction tensor splines. This is along the lines of Kooperberg, and a paper on penalization of time interactions by Robert Gray.

@harrelfe Thanks for tips and references. Glad to have you in on the discussion here! Personally I haven’t done too much work in the area of survival models before, but I can advise whoever works on this (e.g. @ermeel, @kaybenleroll and others) about anything related to the software implementation. Would you be open to continuing to chime in and advise on the survival modeling concepts and best practices?

Would you use (natural cubic) splines to model / smooth

- the hazard function, or
- the (log) cumulative hazard function, as e.g. in Royston & Parmar?

The interpolation between linear and cubic could be achieved through appropriate priors for \gamma (the coefficients of the spline basis functions), in the notation of Royston & Parmar?

In the Royston & Parmar notation / framework, I presume this could be achieved through appropriate priors on \gamma_{j\ell} for \ell\ge 1?
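To make this concrete, one possibility (my sketch, not something from the paper) would be a hierarchical shrinkage prior on exactly those coefficients:

\gamma_{j\ell} \sim N(0, \tau_j^2) \quad \text{for } \ell \ge 1, \qquad \tau_j \sim \text{half-Normal}(0, \sigma_\tau).

As \tau_j \to 0 this recovers the proportional-hazards / linear special case, while the data can pull \tau_j away from zero where a time-varying effect is actually supported.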

Could you let me know the precise reference(s) you mean here?

Sorry for sticking to Royston & Parmar so far, it’s just that I had to start somewhere, and the paper was already printed and at the top of my stack :-)

Best to use a spline on the log hazard or log cumulative hazard, to enforce the correct mathematical constraint. Royston & Parmar may be OK; I just wish they had studied previous papers in the same journal. I haven’t studied the Royston & Parmar notation yet. For Kooperberg see the R polspline package and http://data.vanderbilt.edu/fh/Papers/survival/koo95haz.pdf

Thanks, I will check it. While implementing Royston & Parmar, I realised that a Bayesian formulation might run into problems with their choice of natural cubic splines; in particular, have a look at the following.

Here \ell is the factor in the likelihood that comes from a specific observation. The problem I see is that, in the case of an uncensored observation, one might have a negative value of \ell, due to the factor involving the derivative of s. I presume this is no problem for an MLE treatment, since optimisation doesn’t care whether \ell is positive or negative, but when building our likelihood in Stan with target += increments we have to take the log of \ell.
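To spell out where the sign problem comes from (this is my reading of the Royston & Parmar setup, so treat the details as an assumption): with x = \log t and \eta = s(x;\gamma) + \beta^T z the log cumulative hazard, the survival function is

S(t) = \exp\{-\exp(\eta)\},

so the density for an uncensored observation is

f(t) = -\frac{dS}{dt} = \frac{1}{t}\,\frac{ds}{dx}\,\exp\{\eta - \exp(\eta)\}.

Every factor here is positive except ds/dx, which a natural cubic spline does not constrain, so the uncensored likelihood factor \ell can go negative and \log \ell is then undefined.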

I am not sure how to guarantee that \ell will be positive, other than by using very tight priors on the \gamma parameters, or one should think about monotone splines, for which I presume the derivative of s would be non-negative.

Actually Royston & Parmar state

The estimate of g[S_0(t)] must theoretically be monotone in t, whereas natural cubic splines, which are constrained to be linear beyond certain extreme observations, are not globally monotone. However, the linearity constraint imposes monotonicity in the tail regions where the observed data are sparse, whereas in regions where data are dense (and provided the sample size is not too small), monotonicity is effectively imposed by the data themselves.

Note that with x=\log t

g[S_0(t)] = s(x;\gamma)

but then they go on:

The use of monotone splines, as in Shen [14], makes the computational problem much more difficult and awkward, a cost we do not feel is in general justified.

I was wondering whether the idea of constraining the vector of derivatives (over all uncensored data points) to be positive in the transformed parameters block would solve this problem here (and not add so much complexity to the posterior that it complicates inference using HMC)?

I wish I had the experience to speak to your question. Regarding the negative contribution to the likelihood, that seems to be a general question beyond the scope of the particular spline basis, and I would welcome some input from others. On the monotonicity constraint on the cumulative hazard or log cumulative hazard, it would be nice if there were another flexible function with built-in monotonicity, or a different spline basis. In the end we do need to have an analytic form of the estimated hazard function. For now we don’t want to have monotonicity constraints on covariate effects.

I remember having seen monotonic splines in the scam R package with corresponding papers explaining the underlying math. Maybe this can be helpful to us.

@arya had a presentation at this year’s StanCon that implemented I-splines (https://github.com/pourzanj/Stancon2018_Alzheimers). It wasn’t applied to cumulative hazards, but it means monotone splines have been successfully implemented in Stan at least once. I made a note of it then because that “natural cubic splines are close enough” argument did not seem like it was going to work in Stan for the reasons noted above. I was honestly hoping the person who first mentioned Royston and Parmar had come up with a clever way around it!

@harrelfe, by analytic form do you mean there must be a closed form for the instantaneous hazard or the cumulative hazard? I agree we absolutely need a closed form for the instantaneous hazard, but why couldn’t you evaluate the spline function at some quadrature points and use that to get the cumulative hazard? (Alternately, perhaps obtaining the cumulative hazard could be a use for the 1d integrator.)
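For what it’s worth, the quadrature route is easy to sketch. Here is a minimal Python illustration (not Stan, and not tied to any particular spline basis; the function name `cumulative_hazard` is just mine): approximate \Lambda(t) = \int_0^t \lambda(u)\,du with the trapezoidal rule, checked against a constant hazard where \Lambda(t) = \lambda t holds exactly.

```python
import numpy as np

def cumulative_hazard(hazard, t, n_quad=200):
    """Approximate Lambda(t) = int_0^t hazard(u) du with the trapezoidal rule."""
    grid = np.linspace(0.0, t, n_quad)
    vals = hazard(grid)
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)))

# Sanity check against a constant hazard, where Lambda(t) = lam * t exactly.
lam = 0.3
approx = cumulative_hazard(lambda u: np.full_like(u, lam), 2.0)
# approx agrees with lam * 2.0 (trapezoid quadrature is exact for constants)
```

In Stan one would do the analogous thing with a fixed quadrature grid in transformed data, or with the 1d integrator as suggested above.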

As someone who does both time-to-event modeling and consulting with applied researchers, I don’t feel comfortable with the major Stan implementation of survival models being part of a discipline-specific R package like survHE. Time-to-event models are much more broadly applicable than just health economics; too much of the target audience is going to miss it if it’s cloaked in the language of another application area. For that reason, I’d strongly prefer either rstanarm or a standalone survival package.

We will figure something out, I will look through the literature and think carefully…

I think that’s what’s basically done in the event sub-model in stan_jm, by @sambrilleman.

Royston & Parmar argue as follows for applying splines to the cumulative hazard function rather than to the hazard function itself:

We smooth the transformed survival function rather than the hazard function [12] because we anticipated that ‘end effects’, artefacts in the fitted spline functions at the extremes of the time range, would be more severe for the hazard function.

I agree with your assessment - it should be in a more general-purpose package for survival analysis. Regarding analytic form, I’d be happy to have any one of these functions represented analytically: S(t), \lambda(t), or \Lambda(t). It wouldn’t be the end of the world to just have to graph all three, though. What I liked about the approach that Herndon, Shen, and I took was that you have an analytic form of \Lambda(t) even with step-function time-dependent covariates.
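As a toy illustration of why step-function time-dependent covariates keep \Lambda(t) analytic, here is a hedged Python sketch under the simplifying assumption of a piecewise-constant baseline hazard (a stand-in for illustration, not the Herndon & Shen construction): on each interval the integrand \lambda_0(u)\exp\{\beta^T z(u)\} is constant, so the integral reduces to a finite sum.

```python
import numpy as np

def cum_hazard_piecewise(t, cuts, base_rates, covar_logs):
    """Closed-form Lambda(t) when both the baseline hazard and the covariate
    effect exp(beta'z(u)) are constant on each interval [cuts[k], cuts[k+1])."""
    Lam = 0.0
    for k in range(len(base_rates)):
        lo, hi = cuts[k], cuts[k + 1]
        if t <= lo:
            break
        Lam += base_rates[k] * np.exp(covar_logs[k]) * (min(t, hi) - lo)
    return Lam

cuts = [0.0, 1.0, 2.0, np.inf]
rates = [0.2, 0.5, 0.3]       # hypothetical piecewise-constant baseline hazard
covs = [0.0, 0.0, 0.0]        # zero covariate effect: reduces to the baseline
val = cum_hazard_piecewise(1.5, cuts, rates, covs)  # 0.2*1 + 0.5*0.5 = 0.45
```

The same bookkeeping works with any baseline whose antiderivative is known on each interval; the step covariates only change the constant multiplying each piece.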

That might actually be a neat option, since (as far as I understand it) the I-splines are defined by integrating M-splines, thus the derivatives required for the likelihood at uncensored points (see above) would simply be the M-splines. However, it is not clear to me how restrictive the set of monotone functions that can be represented in this way is…
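To illustrate that M-spline / I-spline relationship numerically, here is a hedged Python sketch using scipy (the knot vector and degree are arbitrary choices for illustration): an M-spline is a B-spline rescaled to integrate to one over its support, and the corresponding I-spline is its antiderivative, which is monotone precisely because its derivative (the M-spline) is non-negative.

```python
import numpy as np
from scipy.interpolate import BSpline

# Hypothetical clamped knot vector and degree, chosen only for illustration.
degree = 2
knots = np.array([0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0])
n_basis = len(knots) - degree - 1

def m_spline(i):
    """M-spline: the i-th B-spline rescaled to integrate to 1 over its support."""
    c = np.zeros(n_basis)
    # A degree-k B-spline integrates to (t[i+k+1] - t[i]) / (k+1) over its support.
    c[i] = (degree + 1) / (knots[i + degree + 1] - knots[i])
    return BSpline(knots, c, degree)

def i_spline(i):
    """I-spline: antiderivative of the M-spline; monotone since M-splines are >= 0."""
    return m_spline(i).antiderivative()

x = np.linspace(0.0, 1.0, 201)
I2 = i_spline(2)(x)
# I2 rises monotonically from 0 to 1 (flat at 1 past the M-spline's support),
# so a non-negative combination of I-splines is itself monotone.
```

A non-negative linear combination of such I-splines (plus an intercept) is then a monotone candidate for the log cumulative hazard, with the M-splines supplying the derivatives needed at uncensored points.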

You can put the spline on the log instantaneous hazard, and with natural cubic splines you can get an explicit integral that would be monotonic. “Works good” as they say in old car ads.

I see that this will be monotonic, since it is integrating a non-negative function (the exp of the log instantaneous hazard). I presume you mean the integral would have an explicit analytic representation in this case? That is, one essentially knows the integrals of exponentials of cubic polynomials? Wow…
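The monotonicity half of this is easy to check numerically. A hedged Python sketch (the log-hazard values at the knots are arbitrary illustrative numbers, and the integral is done by quadrature here rather than analytically):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical log-hazard values at a few knots; any natural cubic spline works here.
t_knots = np.array([0.0, 1.0, 2.0, 3.0])
log_haz = np.array([-1.0, 0.5, -0.2, 0.3])
s = CubicSpline(t_knots, log_haz, bc_type="natural")

# Lambda(t) = int_0^t exp(s(u)) du. The integrand exp(s(u)) is strictly positive,
# so Lambda is strictly increasing no matter how wiggly s itself is.
grid = np.linspace(0.0, 3.0, 400)
haz = np.exp(s(grid))
Lam = np.concatenate(([0.0], np.cumsum(0.5 * (haz[1:] + haz[:-1]) * np.diff(grid))))
# np.diff(Lam) is strictly positive everywhere: Lambda is monotone by construction.
```

Whether the integral also has a closed form (as opposed to needing quadrature like this) is exactly the open question above.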

There are details, but you can do a lot analytically or with nicer approximations. The easiest to implement might be a radial b-spline basis, since there you don’t even exponentiate, and the penalty term could be explicit too.