This is not a bug or issue report but contains information that others might find useful related to understanding the flexible baseline hazard functions used in the rstanarm function stan_surv and how they relate to the Royston-Parmar model that is found in the flexsurv function flexsurvspline. This was explained to me by @sambrilleman and I found it quite interesting. I copy his email here.
Regarding your first question. The RP models use cubic splines on the log cumulative hazard. The M-spline model in stan_surv uses cubic splines on the hazard . The B-spline model in stan_surv uses cubic splines on the log hazard . They all use slightly different splines, but that isn’t the major difference, the major difference is what scale they model things on (i.e. hazard vs log hazard vs log cumulative hazard). Since they are all trying to do something “flexibile and wiggly” on the scale they choose, they are all trying to effectively achieve the same goal. And since the hazard, log hazard, and cumulative log hazard are all transformations of one another then doing something “flexibile and wiggly” on one effectively translates to doing something “flexibile and wiggly” on the other.
RP choose the log cumulative hazard because being of the cumulative hazard already means they don’t need to integrate the hazard to get the cumulative hazard (which is required for evaluating the likelihood) – meaning the RP approach is fast. But the log cumulative hazard has to increase – it is cumulative so it can’t go down! – but the splines RP use don’t enforce that, instead they let the data enforce it. This is ok for MLE, but for Bayesian sampling it is a problem, during MCMC the sampler might choose to make the log cumulative hazard go down, which is equivalent to a negative hazard and then the log hazard isn’t defined and the whole thing becomes a numerical disaster – the MCMC sampler will crash.
So we use M-splines on the hazard (M-splines are positive and always increase) or B-splines on the log hazard (can be negative, positive, go up or down). M-splines are easy to integrate to get the cumulative hazard, so they are faster. Whereas B-splines don’t integrate easily (we have to use numerical quadrature) so they are slower to fit the model. They are both “flexibile and wiggly”, so in most situations (except for some edge cases) the M-splines will be flexible enough and therefore preferable to the faster model fitting.
So in short, the M-spline/B-spline choice is for numerical reasons, since RP models are problematic with MCMC. All are trying to do something “flexibile and wiggly”, just on slightly different scales. So they should provide similar flexibility and results assuming you use a similar number of knots. There might be slight differences in the fit, and in some edge cases (e.g. a sharp change in the hazard early on or something) you might see one works better or worse, but with enough knots they should all do ok.