I have a data set with repeated measures of subjects, sampled consistently at about 100 Hz, leading up to a certain event at which the subjects reacted (the “death” in my case, in survival-analysis terms). The subjects always reacted eventually; it was just a matter of when and under what conditions. I have a set of covariates that I assume affect the reaction timing, most of which are (slowly but consistently) time-varying. Moreover, each subject performed about 10 repetitions of the same experiment, which is a bit unusual compared with classical survival settings, I suppose.
So I thought that the best match for this problem is a survival model with time-varying covariates and a random/group-level-effect term that accounts for the repeated subjects in the data. I did not really find any matching frequentist implementation that I was able to understand, so I turned to Bayesian models for now.
I first tried out the survival functionality in rstanarm (stan_surv) to fit a survival model like this:
fit.rstanarm <- stan_surv(formula = Surv(tStart, tStop, event) ~ a + b + (1|subject), ...)
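For context, the full call on my side looks roughly like this (`dat` is a stand-in name for my data frame in counting-process format, and the sampler settings are just illustrative):

```r
library(survival)  # for Surv()
library(rstanarm)  # development/survival branch that provides stan_surv()

# dat: one row per subject per sampling interval, with columns
#      tStart, tStop, event (1 only in the interval where the reaction happens),
#      the covariates a and b, and a subject id.
fit.rstanarm <- stan_surv(
  formula = Surv(tStart, tStop, event) ~ a + b + (1 | subject),
  data    = dat,
  basehaz = "ms",          # M-spline baseline hazard (the default)
  chains  = 4, cores = 4, seed = 1
)
```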
Then I came across an implementation of a discrete-time survival model via Bayesian logistic regression by @Solomon using brms: chapter *12 Extending the Discrete-Time Hazard Model* of *Applied longitudinal data analysis in brms and the tidyverse*.
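If I read that chapter correctly, the idea is that each discrete time step $j$ in which subject $i$ is still event-free is treated as a Bernoulli trial, with the hazard modelled on the logit scale (the notation here is mine):

$$
h_{ij} = \Pr(T_i = j \mid T_i \ge j)
       = \operatorname{logit}^{-1}\!\left(\alpha + \beta_a a_{ij} + \beta_b b_{ij} + u_i\right),
\qquad
S_{ij} = \prod_{k \le j} \left(1 - h_{ik}\right),
$$

with $u_i \sim \mathcal{N}(0, \sigma_u)$ the subject-level intercept.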
I tried that model out like this:
fit.brms <- brm(formula = event | trials(1) ~ 0 + Intercept + a + b + (1|subject), ...)
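Analogously, the full brms call on my side looks roughly like this (same stand-in `dat`; since the data are already one row per subject per ~10 ms step, each row can serve directly as one discrete-time interval, with event = 1 only in the step where the reaction occurs):

```r
library(brms)

fit.brms <- brm(
  formula = event | trials(1) ~ 0 + Intercept + a + b + (1 | subject),
  data    = dat,
  family  = binomial(link = "logit"),   # discrete-time hazard as Bernoulli trials
  chains  = 4, cores = 4, seed = 1
)
```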
What I care most about in the end is being able to predict the survival probability, or rather the probability that a subject has experienced the event by a given time, i.e. 1 minus the survival probability, given the covariate values at that time.
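In code, this is roughly how I extract that quantity from the two fits (`newdat` is a hypothetical new trajectory in the same format as `dat`; I am not 100% sure I am using posterior_survfit() as intended):

```r
# rstanarm: posterior_survfit() should return posterior survival curves S(t);
# the event probability I am after is then 1 - S(t).
surv_rstanarm <- posterior_survfit(fit.rstanarm, newdata = newdat,
                                   times = 0, extrapolate = TRUE)

# brms: posterior_epred() returns the per-interval hazard h for each posterior
# draw; chaining the intervals of one trajectory gives the survival curve.
# (For a subject not seen in training, allow_new_levels = TRUE would be needed.)
h <- posterior_epred(fit.brms, newdata = newdat)   # draws x intervals
S <- t(apply(1 - h, 1, cumprod))                   # survival S(t) per draw
event_prob <- 1 - S                                # P(event by t), per draw
```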
The two implementations actually yield similar posterior distributions for the parameters a and b. They also predict similar survival probabilities; below is a plot from an example experiment with a on the x-axis:
The survival probabilities look very similar; the hazard rates, however, seem to be on different scales.
My question(s):
Are these types of models related, and if so, how? Do they differ only in their assumptions about the baseline hazard? Is either of them preferable for my problem setting? Does it make sense to fit both of them and then pick the better one, for instance in terms of time-dependent AUC and Brier score?
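Regarding the last point: since every subject eventually has the event (no censoring), I was thinking of something as simple as this for the time-dependent Brier score on held-out trajectories (all object names here are placeholders); for the time-dependent AUC I would probably look at packages such as timeROC or riskRegression:

```r
# surv_pred:  matrix of predicted survival probabilities, one row per held-out
#             trajectory, one column per evaluation time in eval_times.
# obs_time:   observed reaction time of each held-out trajectory.
# eval_times: evaluation times corresponding to the columns of surv_pred.
brier_t <- sapply(seq_along(eval_times), function(j) {
  still_alive <- as.numeric(obs_time > eval_times[j])  # 1 = no event yet at t_j
  mean((still_alive - surv_pred[, j])^2)               # squared error at t_j
})
```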
- Operating System: Ubuntu 20.04
- rstanarm Version: 2.21.2
- brms Version: 2.16.3