When trying something out on the colon
dataset from the R survival
package, I ran into trouble with initialization for a simple Weibull model. A-priori I expect the intercept to be about 7.5 or so (\approx log(5*365.25 days) = log of 5 years median survival for colon cancer has been mentioned in the literature). I think the problem is that initialization is using a N(0,1), which is very far away from 7.5. As a result sampling failed a lot with errors. Is the solution really to use an offset the way I did below? Is there some way to set an initial value for just one parameter (the intercept) and leave all the rest as N(0,1), which should work for everything else?
By the way, if I switched to Cox regression, is there an easy way to make it stratified (to reflect that the things I put into the formula for shape might impact the shape of the survival curve)?
library(tidyverse)
library(survival)
library(tidymodels)
library(brms)
data(cancer)
# Only model mortality (etype==2)
colon <- colon %>%
filter(etype==2) %>%
dplyr::select(-etype) %>%
mutate(id=as.integer(id))
# I have some pre-processing steps that make sense for various reasons
# (e.g. making continuous predictors approx. N(0,1) helps with interpreting priors)
preprocess_recipe <- recipe(status ~ ., data = colon) %>%
update_role(time, new_role = "outcome") %>%
update_role(id, study, new_role = "ID") %>%
step_indicate_na(all_predictors()) %>%
step_impute_median(all_numeric_predictors()) %>%
step_impute_mode(all_nominal_predictors()) %>%
step_sqrt(nodes) %>%
step_mutate(age = 125 - age) %>%
step_log(age) %>%
step_normalize(age, nodes) %>%
step_zv(all_predictors())
# Train recipe and apply it to all data at once (may do CV later)
myrecipe <- prep(preprocess_recipe, training = colon)
all_data <- bake(myrecipe, new_data = colon, everything())
model_weibull <- bf(time | cens(1-status) ~ 1 + factor(rx) + age + nodes + factor(extent) + surg + offset(7.5), #sex + obstruct + adhere + differ +
shape ~ 1 + rx + nodes,
family=weibull(link_shape = "log"))
prior_weibull <- prior(normal(0, 2.5), class=Intercept) +
prior(normal(0, 1), class=b) +
prior(normal(0, 1), class=Intercept, dpar=shape) +
prior(normal(0, 0.25), class=b, dpar=shape)
fit_weibull <- brm(model_weibull,
data=all_data,
prior=prior_weibull,
cores=4,
seed=234235)
- Operating System: x86_64, linux-gnu
- R version: R version 4.0.5 (2021-03-31)
- brms Version: 2.18.0