Error message when attempting Cox PH model

Hello,

I am attempting to run a Cox proportional hazards model using brms. I keep getting an error that I have not been able to resolve after a few hours of troubleshooting. The error message reads:

Error in FUN(X[[i]], ...) : Stan does not support NA (in Zbhaz) in data
failed to preprocess the data; sampling not done

Here is the code used to specify the model:

abandon_glm <- brm(commitment_latency | cens(censored) ~ age_stndrd + location_string + first + gender_string,
                   data = abandon_latency,
                   family = brmsfamily("cox"),
                   iter = 5000,
                   warmup = 1500,
                   control = list(adapt_delta = 0.99, max_treedepth = 50),
                   save_pars = save_pars(all = TRUE),
                   cores = 4)

I constructed the censoring variable as follows:

abandon_latency <- wesch_data %>%
  mutate(commitment_latency = ifelse(commitment_latency > 61, 61, commitment_latency),
         censored = ifelse(commitment_latency == 61, "right", "none"),
         commitment_latency = as.integer(commitment_latency)) %>%
  relocate(censored, .after = commitment_latency) %>%
  select(commitment_latency, censored, age_stndrd, location_string, first, gender_string, condition_string)

I came across a thread from a few years ago on this forum in which someone had the same problem ("Brms package for survival"). In that case, however, the issue was resolved by updating brms, which does not seem to be an option for me because my version is already up to date (brms 2.15.0; OS: macOS Catalina 10.15.7).

Thank you for your attention to this post and any help offered. I am happy to provide more information on my problem if that may be of use.

Hi,
some quick checks: do you have NAs anywhere in your data? Can you find a minimal subset of your data that causes the issue (e.g. by halving the dataset until the error disappears and then adding rows back until it reappears)? Can you share this minimal dataset with us?
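
The two checks above can be sketched in base R as follows. This is only an illustration on a made-up toy data frame (the column names are borrowed from the original post, the values are invented), not the poster's actual data or a prescribed procedure.

```r
# Toy stand-in for the real data; values are invented for illustration.
abandon_latency <- data.frame(
  commitment_latency = c(12L, 61L, 30L, 61L),
  censored = c("none", "right", "none", "right"),
  age_stndrd = c(-0.5, 0.3, 1.1, -0.9)
)

# Count missing values per column; all zeros means the data itself has no NAs.
na_counts <- colSums(is.na(abandon_latency))
print(na_counts)

# Split the rows into two halves as a first bisection step
# towards a minimal subset that still triggers the error.
in_first_half <- seq_len(nrow(abandon_latency)) <= nrow(abandon_latency) / 2
first_half <- abandon_latency[in_first_half, ]
second_half <- abandon_latency[!in_first_half, ]
```

Each half would then be passed to the same brm() call, keeping whichever half still reproduces the error and repeating the split.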

Hope we can get it resolved quickly.

Hi Martin,

Thank you for offering help, it is greatly appreciated.

Since posting the above message, I have figured out what the issue seems to have been. First, to answer your questions: I have no NAs in my dataset. I have not yet checked whether a minimal dataset would let the Cox PH model run without the workaround described below; it might, and I can look into that tomorrow.

The fix that seems to have worked for me is to adjust the latency values I assign to censored participants. As background, participants had a 60 s window within which the target event could occur, so their latency scores (“commitment_latency”) could range between 0 and 60. Initially (i.e., in the above code), I assigned participants who did not leave, and whose latency scores were therefore right-censored, a score of 61. The issue seems to have been that this value was equal to or greater than the latency score (60 s) of the latest-responding participant whose data were not censored. Consistent with this, the same error occurred when I instead assigned censored participants the ceiling score of 60. Only when I gave the censored participants a commitment latency score of 59.999 did the model run as expected.
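
A small base R sketch of the pattern described above, on invented toy data: it checks whether any censored time is at or beyond the largest uncensored time, and shows one way to code censored rows at the end of the observation window. The names `window_end`, `d` and `d_capped` are hypothetical, and the check only detects the pattern; it does not explain why brms 2.15.0 produced NAs in Zbhaz.

```r
# Toy data mimicking the original coding: 60 s window, censored rows set to 61.
window_end <- 60  # hypothetical name for the end of the observation window
d <- data.frame(
  commitment_latency = c(5, 22, 60, 61, 61),
  censored = c("none", "none", "none", "right", "right")
)

# Largest observed event time and largest censored time.
max_event <- max(d$commitment_latency[d$censored == "none"])
max_censored <- max(d$commitment_latency[d$censored == "right"])

# TRUE when a censored time is equal to or greater than every event time,
# which is the situation that coincided with the error in this thread.
censored_at_or_beyond <- max_censored >= max_event

# Code censored rows at the window end instead of an arbitrary larger value.
d_capped <- d
d_capped$commitment_latency <- pmin(d_capped$commitment_latency, window_end)
```

Capping at the window end keeps the censoring indicator unchanged and only alters the recorded time for rows that exceeded the window.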

Unfortunately, my question now is whether this is even a sensible thing to do, i.e., to assign censored participants a score (59.999) that is lower than that of the latest-responding participant (60).

OK, that’s pretty weird. Tagging @paul.buerkner to check whether this is expected or a bug (at the very least, it is problematic that you don’t get a more informative error message).

Thanks. I will take a look this week.

Can you try with the latest GitHub version of brms as well? If that still fails, could you please open an issue on GitHub (paul-buerkner/brms) and provide a minimal reproducible example of the problem you are facing?

Hi Paul,

Thank you for suggesting the latest GitHub version of brms (version 2.15.5). After installing it, the Cox models run as expected when I assign participants for whom the target event did not occur the maximum latency score (i.e., 60 s).

So, the issue that caused me to make this thread seems to have been resolved. Thanks!

For reference, the issue arose when I was using brms version 2.15.0.
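
For anyone checking their own setup, a minimal sketch of comparing an installed version against the one reported to work here. The helper `is_older_than` is hypothetical, and 2.15.5 is simply the GitHub version mentioned in this thread, not necessarily the exact version in which the behaviour changed.

```r
# Hypothetical helper: is an installed version older than a reference version?
is_older_than <- function(installed, reference = "2.15.5") {
  package_version(installed) < package_version(reference)
}

older_2150 <- is_older_than("2.15.0")  # the version that produced the error here
older_2155 <- is_older_than("2.15.5")  # the GitHub version that worked here

# With brms installed, the same check would be:
# is_older_than(as.character(packageVersion("brms")))
# One common way to install the development version (needs the remotes package):
# remotes::install_github("paul-buerkner/brms")
```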

Best,
Jared
