Survival analysis with right censored and interval censored data with brms

fsdias · October 1, 2021, 3:32pm

Hi,

I’m working on a project that aims to assess the survival of oak seedlings under different conditions.

We planted a few hundred acorns under different types of cover (shrub cover, tree cover, open areas etc.). Many acorns never germinated, but some grew into seedlings. We visited the plots periodically to check if the seedlings were still alive and recorded the number of days the seedlings survived. After a while, we ran out of funding, so we had to visit the sites less often than we would have liked. This means the visits were further and further apart towards the end of the study.

I believe the dataset has two types of censoring, interval censoring and right censoring.

Interval censoring - when we found a seedling alive at “t1” but dead at “t2”, we must assume that it died at some point between “t1” and “t2”.

Right censoring - because some seedlings managed to survive until the end of the study

I’m trying to run the analysis with “brms”, but I’m struggling with the syntax.

This is what the dataset looks like:

days_to_event_alive - number of of days between germination and the last time the seedling was found alive.

days_to_event_death - number of of days between germination and the time the plant was found dead.

Event - cause of death

cens_int - my attempt at creating a censoring column

I wrote the following brms code

fit = brm(days_to_event_alive|cens(cens_int,days_to_event_death) ~ 0+cover_type, data = df,
          prior(normal(0, 10), class = b),
          family = exponential,chains = 4, cores = 4)

and I have the following questions:

are my data interval censored and right censored?
did I assign the right codes to each type of censoring?
is the brms code right? It runs with the full dataset, but it returns “Log probability evaluates to log(0), i.e. negative infinity.” I guess it’s either because the model is not parameterized correctly or I need to set adequate priors. I’m not sure which.

Thanks in the advance for helping me.

Example data:

df<-structure(list(days_to_event = c(1069, 345, 1003, 309, 375, 246, 
1033, 345, 1003, 282, 411, 73, 67, 967, 543, 1033, 1033, 723, 
279, 1033, 753, 348, 1069, 1069, 345, 177, 1033, 164, 411, 411, 
128, 1069, 164, 348, 309, 1033, 753, 1033, 753, 753, 411, 375, 
243, 279, 1033, 213, 411, 723, 375, 513, 543, 1033, 375, 194, 
309, 723, 1003, 579, 1033, 348, 651, 210, 753, 375, 967, 579, 
477, 789, 1069, 375, 279, 1033, 309, 1033, 312, 103, 375, 345, 
753, 213, 1033, 789, 312, 1069, 279, 1033, 1033, 348, 177, 375, 
279, 279, 282, 753, 375, 309, 309, 753, 753, 411), days_to_event_alive = c(1069, 
282, 1003, 246, 312, 177, 1033, 282, 1003, 213, 348, 37, 31, 
687, 375, 753, 753, 513, 230, 1033, 543, 279, 1069, 1069, 282, 
128, 753, 129, 348, 348, 93, 789, 129, 279, 246, 1033, 543, 1033, 
543, 543, 348, 312, 194, 230, 1033, 164, 348, 513, 312, 345, 
375, 1033, 312, 159, 246, 513, 723, 411, 1033, 279, 441, 141, 
543, 312, 967, 411, 309, 579, 1069, 312, 230, 1033, 246, 1033, 
243, 67, 312, 282, 543, 164, 1033, 579, 243, 789, 230, 1033, 
1033, 279, 128, 312, 230, 230, 213, 543, 312, 246, 246, 543, 
543, 348), days_to_event_death = c(1069, 345, 1003, 309, 375, 
246, 1033, 345, 1003, 282, 411, 73, 67, 967, 543, 1033, 1033, 
723, 279, 1033, 753, 348, 1069, 1069, 345, 177, 1033, 164, 411, 
411, 128, 1069, 164, 348, 309, 1033, 753, 1033, 753, 753, 411, 
375, 243, 279, 1033, 213, 411, 723, 375, 513, 543, 1033, 375, 
194, 309, 723, 1003, 579, 1033, 348, 651, 210, 753, 375, 967, 
579, 477, 789, 1069, 375, 279, 1033, 309, 1033, 312, 103, 375, 
345, 753, 213, 1033, 789, 312, 1069, 279, 1033, 1033, 348, 177, 
375, 279, 279, 282, 753, 375, 309, 309, 753, 753, 411), cover_type = c("open", 
"open", "diverse", "salvi", "open", "open", "open", "open", "open", 
"open", "open", "salvi", "open", "ulex", "open", "open", "open", 
"salvi", "open", "diverse", "open", "open", "diverse", "diverse", 
"ulex", "open", "esteva", "open", "open", "open", "open", "diverse", 
"diverse", "open", "ulex", "open", "salvi", "open", "esteva", 
"salvi", "salvi", "salvi", "open", "open", "diverse", "open", 
"open", "open", "salvi", "open", "open", "open", "open", "open", 
"salvi", "open", "esteva", "open", "open", "diverse", "salvi", 
"diverse", "salvi", "open", "open", "diverse", "open", "open", 
"open", "open", "salvi", "open", "open", "diverse", "open", "salvi", 
"salvi", "open", "ulex", "salvi", "salvi", "diverse", "open", 
"open", "diverse", "salvi", "open", "open", "open", "open", "diverse", 
"open", "salvi", "open", "open", "open", "open", "open", "open", 
"open"), Event = c("End of study", "Dried", "End of study", "Dried", 
"Dried", "Dried", "End of study", "Dried", "End of study", "Dried", 
"Dried", "Unknown", "Unknown", "Unknown", "Dried", "Dried", "Dried", 
"Dried", "Dried", "End of study", "Dried", "Dried", "End of study", 
"End of study", "Unknown", "Dried", "Unknown", "Dried", "Dried", 
"Dried", "Dried", "Dried", "Dried", "Dried", "Unknown", "End of study", 
"Dried", "End of study", "Dried", "Dried", "Dried", "Dried", 
"Dried", "Dried", "End of study", "Dried", "Dried", "Dried", 
"Dried", "Dried", "Unknown", "End of study", "Dried", "Dried", 
"Dried", "Dried", "Unknown", "Dried", "End of study", "Dried", 
"Dried", "Dried", "Dried", "Dried", "End of study", "Unknown", 
"Dried", "Dried", "End of study", "Unknown", "Dried", "End of study", 
"Dried", "End of study", "Dried", "Unknown", "Dried", "Dried", 
"Dried", "Dried", "End of study", "Dried", "Dried", "Unknown", 
"Dried", "End of study", "End of study", "Dried", "Destroyed by wildboars", 
"Dried", "Dried", "Dried", "Dried", "Dried", "Dried", "Dried", 
"Dried", "Dried", "Dried", "Dried"), cens_int = c("right", "interval", 
"right", "interval", "interval", "interval", "right", "interval", 
"right", "interval", "interval", "interval", "interval", "interval", 
"interval", "interval", "interval", "interval", "interval", "right", 
"interval", "interval", "right", "right", "interval", "interval", 
"interval", "interval", "interval", "interval", "interval", "interval", 
"interval", "interval", "interval", "right", "interval", "right", 
"interval", "interval", "interval", "interval", "interval", "interval", 
"right", "interval", "interval", "interval", "interval", "interval", 
"interval", "right", "interval", "interval", "interval", "interval", 
"interval", "interval", "right", "interval", "interval", "interval", 
"interval", "interval", "right", "interval", "interval", "interval", 
"right", "interval", "interval", "right", "interval", "right", 
"interval", "interval", "interval", "interval", "interval", "interval", 
"right", "interval", "interval", "interval", "interval", "right", 
"right", "interval", "interval", "interval", "interval", "interval", 
"interval", "interval", "interval", "interval", "interval", "interval", 
"interval", "interval")), row.names = c(NA, -100L), class = c("tbl_df", 
"tbl", "data.frame"))

StaffanBetner · October 5, 2021, 9:13pm

The answer is yes to all questions. The initialization problem does not tend to be a problem if it runs without any bad diagnostics. The sampler just have a little problem finding valid starting points, but finds them quite fast anyway.

l-psych · November 29, 2022, 3:25pm

Hi!

I have a very similar question as the one from fsdias, but I am not a 100% sure I am applying the solution from @StaffanBetner the right way. Also, I would like to use this survival model in a different way.

I am interested in predicting psychopathology development over time using survival parameters from the survival model.
I have data collected using ESM in daily life during which individuals reported whether they had stressful events and reported their stress for the rest of the day. My overall goal is to find out whether the time to recovery from stressful events can predict psychopathology development. My analysis plan therefore was as follows:

run a survival model to subtract coefs representing individual survival parameters
use these individual survival parameters as predictors in the model predicting psychopathology over time in the form of : psychopathology ~ time*survival_coefs (of course also including all relevant random effects)

The dataset looks like this:
status (1 = recovered to baseline level of stress / 0 = not recovered within followup time)
time1 and time2: when status = 1 (recovered): Time 1 and time 2 are interval in which recovery has happened.When status = 0 (not recovered) then time1 is the amount of follow-up time during which recovery did not occur
SA.b.s is the level of stress at the time of the event (indicator of how stressful the event was
cens is the censoring variabele I created for the brms model

my data is, if I am not mistaken, both right-censored (not everyone recovers in the follow-up time) and interval censored (we know recovery happened somewhere between time1 and time2, but are not sure exactly which minute).

I am not sure whether I created the censoring variable in the right way. I defined it now as 0=observed recovery and 1=censored (right censored). See sample of the data below:

I formulated my brms syntax as follows:
fit1 ← brm(time1 | cens(cens, time2) ~ SA.b.s + (1 + SA.b.s |ID)

My questions are:

does the analysis plan make sense?
did I create the censoring variable in the right way?
is the brms code right? especially with regards to the censoring?

Thanks so much in advance!

fsdias · November 29, 2022, 5:58pm

Hi,

I assume you’re trying to split the analysis over two models to work around the fact you have a censored covariate?

If time to recovery wasn’t censored you would probably be working with this model, right?

Psychopathology development ~ Bernoulli(p)

logit(p) ~ a + B1*Time_to_recovery + B2[SAbs]

I’ve never worked with models with censored independent variables, but I suspect that’s possible. See:

and

https://cran.r-project.org/web/packages/brms/vignettes/brms_missings.html

If I’m not mistaken, this means that you’d be treating the censored values as missing data, replacing the missing data with parameters and then estimate those parameters using an imputation method method.

Let’s see if someone more qualified than me can weigh in on this.

l-psych · November 29, 2022, 6:54pm

Thanks so much for your response!
I have indeed focused on survival analysis due to the censored predictor issue. It is a work around. I have found some pages where suggestions were given to handle this with imputation (e.g. Censored predictors · Issue #565 · paul-buerkner/brms · GitHub), but I’m not sure it’s the best way to go… Paul Buerkner commented there that it may be something to add to brms’s functionalities in the future. @paul.buerkner do you maybe have any suggestions?
Thanks @fsdias for your response!

paul.buerkner · November 30, 2022, 2:57pm

brms does not yet support censored predictors and this topic is a bit outside my expertise that I cannot really suggest any useful method for this right now. Sorry :-(

l-psych · November 30, 2022, 3:52pm

Thanks for your quick reply @paul.buerkner! No problem, than I will just use this workaround. Does subtracting the coef’s from the brms model to then use them as predictors in the second model sound logical to you? Or do you know anyone who may have expertise on this? Thanks a lot in advance.

mebrooks · June 26, 2025, 3:39pm

Hi @fsdias, I followed your example above to do my own analysis of the time to an event with right censored and interval censored data. Then I looked for the publication to cite and it looks like you used a Weibull distribution and the cmdstanr package. So I’m wondering if I need to also redo my analyses. Can you explain your reasons for the changes? Thanks in advance.

fsdias · July 1, 2025, 7:59am

I assume you’re referring to this paper?

I prefer to work with Stan directly (without interfaces or wrappers), and cmdstanr is my tool of choice. That said, you could certainly fit the same model using brms if you prefer. I opted for a Weibull distribution instead of an exponential because it provided a better fit to the data.

Topic		Replies	Views
Using workaround to use censored predictor time-to-event data brms rstan	3	826	April 24, 2023
Brms: Right censored model is not converging, while a similar left censored model runs fine Modeling fitting-issues , brms	9	277	January 19, 2026
Debug model with interval and right censoring brms	7	1368	February 13, 2020
Mixed (right, left and interval) censored log-normal with brms brms fitting-issues , specification	1	2342	June 17, 2022
How to validate a survival model with both right censoring and interval censoring? Modeling ecology	14	2780	February 25, 2022

Survival analysis with right censored and interval censored data with brms

Related topics