Chains hang after a few hours (rstanarm survival)

I have two laptops with relatively good specifications that I use for running Bayesian models. Both have up-to-date software and a correctly configured Stan installation. Stan, brms and rstanarm models compile fine. However, when I specify my rstanarm survival models with a time-varying effect (tve()), the chains hang or never finish. No errors are shown.

What might be the problem, and how can I solve it?

A reproducible example follows. The code downloads the data, whose continuous variables are already centered.

#data
library(tidyverse)
data = read_delim("https://www.dropbox.com/s/ufpmejd6do8saq6/shared_data.csv?dl=1", ";", escape_double = FALSE, trim_ws = TRUE) 

#model
library(rstanarm)
options(mc.cores = parallel::detectCores())
m1 = stan_surv(formula = Surv(time, status) ~ p1 + p2 + tve(p3) + p4, basehaz = "ms", basehaz_ops = list(df = 8), data = data)

PS: stan_surv() is a function from the survival branch of the rstanarm package. The following code should be used to install it (GitHub - stan-dev/rstanarm: rstanarm R package for Bayesian applied regression modeling):

install.packages("rstanarm", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

If the last one didn't work and caused R to crash, this one helps (Stan_surv crashes R repeatedly - #13 by binman):
remove.packages(c("StanHeaders", "rstan"))
install.packages("StanHeaders", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))


Sorry, I forgot to mention that the survival functionality is not yet available in the CRAN version of rstanarm. I revised the original post.

Hi,
just to be clear - the hanging happens on both your computers?
Anyway, this is a bit hard to diagnose from a distance. Before trying it on my computer, I would copy some advice from [2011.01808] Bayesian Workflow, which is IMHO highly relevant here:

For example, we recently received a query in the Stan users group regarding a multilevel logistic regression with 35,000 data points, 14 predictors, and 9 batches of varying intercepts, which failed to finish running after several hours in Stan using default settings from rstanarm.

We gave the following suggestions, in no particular order:

  • Simulate fake data from the model and try fitting the model to the fake data. Frequently a badly specified model is slow, and working with simulated data allows us not to worry about lack of fit.
  • Since the big model is too slow, you should start with a smaller model and build up from there. First fit the model with no varying intercepts. Then add one batch of varying intercepts, then the next, and so forth.
  • Run for 200 iterations, rather than the default (at the time of this writing). Eventually you can run for 2000 iterations, but there is no point in doing that while you’re still trying to figure out what’s going on. If, after 200 iterations, $\widehat{R}$ is large, you can run longer, but there is no need to start with 2000.
  • Put at least moderately informative priors on the regression coefficients and group-level variance parameters.
  • Fit the model on a subset of your data. Again, this is part of the general advice to understand the fitting process and get it to work well, before throwing the model at all the data.

The common theme in all these tips is to think of any particular model choice as provisional, and to recognize that data analysis requires many models to be fit in order to gain control over the process of computation and inference for a particular applied problem.
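For the model in this thread, the first few suggestions could be sketched roughly as follows. This is only a sketch under assumptions: the fake-data generator (exponential event times, uniform censoring, the coefficient values) is made up for illustration, the variable names p1 to p4 are borrowed from the original post, and the fitting step is guarded because it needs the survival branch of rstanarm.

```r
# Sketch only: fake data loosely shaped like the data in the question.
# The data-generating process below is an assumption for illustration.
set.seed(1)
n <- 500
fake <- data.frame(
  p1 = rnorm(n),
  p2 = rnorm(n),
  p3 = rnorm(n),
  p4 = rnorm(n)
)
rate <- exp(-2 + 0.5 * fake$p1 - 0.3 * fake$p2 + 0.2 * fake$p3)
event_time  <- rexp(n, rate = rate)
censor_time <- runif(n, min = 0, max = 20)
fake$time   <- pmin(event_time, censor_time)
fake$status <- as.integer(event_time <= censor_time)

# Build the model up one step at a time, leaving tve() for last.
formulas <- list(
  Surv(time, status) ~ p1,
  Surv(time, status) ~ p1 + p2,
  Surv(time, status) ~ p1 + p2 + p3 + p4,
  Surv(time, status) ~ p1 + p2 + tve(p3) + p4
)

# Short runs on a subset of rows, timing each step.
subset_rows <- sample(nrow(fake), 200)
if (requireNamespace("rstanarm", quietly = TRUE) &&
    "stan_surv" %in% getNamespaceExports("rstanarm")) {
  library(rstanarm)
  for (f in formulas) {
    elapsed <- system.time(
      fit <- stan_surv(f, data = fake[subset_rows, ], basehaz = "ms",
                       chains = 2, iter = 200)
    )
    print(elapsed["elapsed"])
  }
}
```

If one step suddenly takes much longer than the previous one, the term added at that step is the first place to look.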

Let us know if any of this sheds a bit more light on the problem. If not, I’ll give the model a try on my computer.

Best of luck!


Thank you, Martin! This was really useful. I had to improve my workflow: I started looking at the variables one by one and checked the sampling speed for each model. In the end, scaling (not only centering) one continuous variable helped.
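For anyone landing here later, the difference between centering and scaling could look like this in base R. The data frame, the column name p3 and its values are stand-ins, since which variable needed scaling is not stated above.

```r
# Hypothetical example data: p3 and its values are placeholders,
# not the variable from the shared data set.
set.seed(1)
d <- data.frame(p3 = rnorm(100, mean = 50, sd = 400))

# Centering only: mean 0, but the spread stays in the original units.
d$p3_centered <- d$p3 - mean(d$p3)

# Centering and scaling: mean 0 and standard deviation 1.
d$p3_scaled <- as.numeric(scale(d$p3))

# Compare the standard deviations of the two versions.
sapply(d[c("p3_centered", "p3_scaled")], sd)
```

A predictor on a much larger scale than the others tends to make the posterior harder for the sampler to explore, which is one plausible reason the scaled version ran faster.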
