Proposal for moving stan_surv() to a separate package

Hello All,

I would like to propose moving stan_surv() (the survival functionality) from rstanarm to a separate package, say SurvivalStan.

The reasons are as follows:

  • At this point, it seems unlikely that the survival branch in rstanarm will ever make it into main, which is limiting its use in pharma and other areas that rely on survival modeling.

  • Survival analysis is a fairly specialized area, and people who do it tend to live in those models and packages.

  • Recent guidance from the FDA on the use of Bayesian methods in clinical trials is giving Bayes some tailwinds, and it would be a shame not to take advantage of them.

  • Recent developments in AI make it particularly well-suited to tasks of this kind – I, together with my friend Claude, created this migration plan here to demonstrate. This is not meant to be the target repository; it’s just a demo.

I have spoken to @jackinovik (she was easy to find), @ermeel and @sambrilleman, who wrote the paper and did the original work, and they are all on board, although Sam is unlikely to have time to contribute. In addition, my former student, Adeeba Tak, is eager to help and contribute.

One thing I want to make clear is that this is not an indictment of rstanarm proper. It’s a great package: I reach for it when I need to fit a quick model, and I use it in my classes at NYU. I also recognize that the inability to push this branch to CRAN has nothing to do with rstanarm developers and has everything to do with CRAN limitations.

I am posting this here as an RFC (remember those?). If anyone has strong objections and can articulate them, we would love to hear them. Perhaps there are alternative approaches, such as moving the functionality to brms. However, I think there is a lot of value in a turnkey, precompiled (or compiled on install) solution, as rstanarm itself has demonstrated. Also, if there are others who want to work on it and contribute, please let us know.

I’d like to second this; it makes sense, especially in light of the recent Bayesian FDA guidance!

I’m definitely in favor of making this functionality more widely available and I don’t really have any strong objections to moving it outside of rstanarm, but here are a few things that are worth thinking about and discussing before making a decision:

- It’s conceivable we could get this onto CRAN as part of rstanarm now (past comments from Rok and Ben in various rstanarm issues suggested the original hurdles are probably not a problem anymore). That said, there’s always some chance CRAN would find another problem that we haven’t considered, so I can’t guarantee it.

- My understanding was that the main reason it’s not already in rstanarm is just that nobody was working on it. Sam explained most of the remaining tasks in rstanarm issue #570 (“Include `stan_surv` in CRAN release of rstanarm?”, stan-dev/rstanarm on GitHub), but he didn’t have the bandwidth to work on them and I don’t think anyone else picked up those remaining tasks (correct me if I’m wrong about that, it’s been a while!). But it sounds like there’s now someone interested in working on it.

- One of the main reasons this was originally included in rstanarm and not in a separate package is Sam’s comment in rstanarm issue #570 (“Include `stan_surv` in CRAN release of rstanarm?”):

I think it might be a bit hard to maintain as an entirely separate package, including the refactoring to work out how to avoid having to duplicate all the external postestimation and/or internal util functions in both packages!

Is there a plan for getting around this or is the plan just to bite the bullet, so to speak? Is MIGRATION_PLAN.md in the ericnovik/survivalstan repository really the full list of rstanarm functions used (aside from the rstanarm Stan functions that are #included in surv.stan)? That list is much shorter than I expected, but it’s been a while since I looked at any of the survival code so it could be right. As long as copyright / authorship is handled properly you can certainly copy all the rstanarm code you want and use it in a new package; I’m just curious whether that was the plan or if you have a different one.

- The Claude plan mentions using cmdstanr instead of rstan/rstantools. You can use the instantiate package (“Pre-Compiled CmdStan Models in R Packages”) for this, but there’s no way that I know of to avoid having to compile the models during installation. They’re reused after the initial compile, so you get around recompiling every time, but, unless I’m mistaken, the user still needs a proper toolchain setup because models are compiled during installation. Not needing the toolchain setup was one of the motivations for rstanarm. If the intended users for the survival modeling are used to setting up C++ toolchains then this isn’t an issue; I just wanted to mention it in case it’s relevant. If you want rstanarm-like pre-compilation you could still make a separate package via rstantools.

Anyway, I’m glad there’s interest in reviving this whether it’s in rstanarm or a separate package!

It seems you can do this on the r-universe platform, at least to some extent: I recently downloaded nutpieR and was able to run it without installing a Rust toolchain (nutpieR: R Bindings for the Nutpie NUTS Sampler).

There’s also a branch, survival-rstantools in stan-dev/rstanarm on GitHub, made by @andrjohns, which seems to have made some progress and worked on compatibility with rstan 2.36. That branch may be the right place to start if resuming work, and maybe Andrew could let you know how he left things when he stopped working on it if you want to continue it. Or if you decide you want to move stan_surv into a separate package, that branch could still be a good place to start.

- It’s conceivable we could get this onto CRAN as part of rstanarm now (past comments from Rok and Ben in various rstanarm issues suggested the original hurdles are probably not a problem anymore). That said, there’s always some chance CRAN would find another problem that we haven’t considered, so I can’t guarantee it.

This would be the path of least resistance. Do you know who was the last person who tried putting it in?

Is there a plan for getting around this or is the plan just to bite the bullet, so to speak? Is MIGRATION_PLAN.md in the ericnovik/survivalstan repository really the full list of rstanarm functions used (aside from the rstanarm Stan functions that are #included in surv.stan)? That list is much shorter than I expected, but it’s been a while since I looked at any of the survival code so it could be right. As long as copyright / authorship is handled properly you can certainly copy all the rstanarm code you want and use it in a new package; I’m just curious whether that was the plan or if you have a different one.

If we do move it, I think we will have to do all that, assuming we can get someone to commit to doing the maintenance. In terms of the function list, we will have to do a more careful review, and, of course, the package would have to go under GPL if we copy rstanarm code.

- The Claude plan mentions using cmdstanr instead of rstan/rstantools. You can use the instantiate package (“Pre-Compiled CmdStan Models in R Packages”) for this, but there’s no way that I know of to avoid having to compile the models during installation.

That’s my understanding as well.

Do you know who was the last person who tried putting it in?

By putting it in, do you mean trying to submit it to CRAN? I don’t think there was ever an actual attempt, because the remaining tasks (as described in Sam’s comment) were never finished. We didn’t think CRAN would accept it back then because of known CRAN policies, but those issues seem to be surmountable now based on comments from Rok and Ben. Rok said:

If I remember correctly, the issue was that the produced assembly was too big for 32-bit Windows, which meant that CRAN could not build binaries for 32-bit Windows.

However, CRAN has since dropped the requirement to support 32-bit Windows once R 4.1 was released. So we might be able to get this on CRAN now.

And Ben said:

The 32bit thing was an issue, although we had always been able to barely get around it with rstanarm. The bigger deal for CRAN was the compilation time and RAM especially on Windows, but that seems to have been lessened by the additions to src/Makevars.win that utilize LTO. It does not take so long to compile any one model into the intermediate representation and both that and the linking / optimization can be done in parallel. And it doesn’t seem to have broken anything. So, I think we can start to add Stan programs, starting with the survival and CAR models that have been sitting around for ages.

I know that, at least right now, neither Ben nor I have the time to work on this, but if you have someone who wants to work on the remaining tasks in rstanarm then we could certainly try submitting it to CRAN when it’s ready and see if they accept it.

Cool, thanks, let me ask Adeeba if she would be up for it.