Brms: Right censored model is not converging, while a similar left censored model runs fine

Hello everyone and a happy new year! :)

Never posted here before, but I´ll try my best. I am fitting a rather simple Bayesian linear mixed-effects model ins brms (R version 4.4.2 and brms_2.23.0, on Win 11)

I am analysing monitoring data of animals, that tracks when individuals of a species a) appear for the first time in our research sites in a given year, and b) when they leave again for the winter.
The data is censored by the timing of when our monitoring started or ended that season, i.e. in some instances, some or even all individuals where already present in our study site (left centered data). Similarly, some individuals were still present when we left at the end of the season (right censored data). The actual censoring dates change between the different years, as monitoring stared/ended on different days each season.

Looking at the raw data, the distribution of arrival and departure dates vary significantly between years…sometimes I see strong unimodal peeks, when all individuals arrive/leave nearly at the same date. In other years, individuals arrive or departe across several weaks, or even in a bimodal patterns. Across all years, arrival data are generally right skewed, while departure data are left skewed. That was my rational for using the skew_normal distribution.

In a first step, I wanted to see if the arrival and departure dates changed across years. My response variables are ordinal date numbers. Using default priors, I ran this (with 1010 observations, across 262 IDs). As I have repeated measures for some individuals (but not all), I wanted to include the individual´s ID (tag_ID), so this was my model for the Arrival:

mod_arrival_year_effect <- brm (first_observation | cens(censored_start_monitoring) ~  
                                          as.numeric(year) + 
                                          (1|tag_id), 
                                          data = data_weather_arrival,
                                          family = skew_normal(),
                                          chains=4, cores=4, iter = 3000)

This model runs fine: Rhat, Bulk_ESS, Tail_ESS and look good. The posteriors check (with the warning that the censored responses are not included) don´t look perfect to me, but I am happy that the model is at least running. This model also give nearly identical results to a linear mixed effect model (estimated with lmer).

Similary, for the Departure I then ran this model (1217 observations, across 314 IDs):

mod_departure_year_effect <- brm (last_observation | cens(censored_end_monitoring) ~  
                                          as.numeric(year) + 
                                          (1|tag_id), 
                                          data = data_weather_departure,
                                          family = skew_normal(),
                                          chains=4, cores=4)

This was the default prior:

                  prior     class           coef  group resp dpar nlpar lb ub tag       source
           normal(0, 4)     alpha                                                      default
                 (flat)         b                                                      default
                 (flat)         b as.numericyear                                  (vectorized)
 student_t(3, 252, 8.9) Intercept                                                      default
   student_t(3, 0, 8.9)        sd                                        0             default
   student_t(3, 0, 8.9)        sd                tag_id                  0        (vectorized)
   student_t(3, 0, 8.9)        sd      Intercept tag_id                  0        (vectorized)
   student_t(3, 0, 8.9)     sigma                                        0             default

Here the summary:

 Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors. Family: skew_normal 
  Links: mu = identity 
Formula: last_observation  | cens(censored_end_monitoring) ~ as.numeric(year) + (1 | tag_id) 
   Data: data_weather_departure (Number of observations: 1217) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

 Multilevel Hyperparameters:
 ~tag_id (Number of levels: 314) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
 sd(Intercept)     1.60      1.81     0.14     4.61 4.08        4       11
 
 Regression Coefficients:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
 Intercept         -5.20     13.60   -16.14    17.71 3.55        4       13
 as.numericyear     0.38      0.85    -1.01     1.13 3.70        4       13

 Further Distributional Parameters:
       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
 sigma    12.01      1.99    10.71    15.51 4.56        4       11 
 alpha     1.98      0.02     1.96     2.01 3.80        4       11

So…pretty bad. I am surprised that the intercept for theregression coefficients was estimated to be -5.20, when it should be somewhere around 250 (but maybe that is due to the low effective sample sizes and does not indicate anything at this stage?). I also got a lot of warnings during warmup (“Log probability evaluates to log(0), i.e. negative infinity”). Following suggestions I found online, I tried setting a stronger prior, the inits, delta and treedepth.
But I am not sure that is did this correctly:

init <- list(
  b_Intercept = 250,  
  b_as.numericyear = 0,  
  sd_tag_id = 1)
list_inits = list(init, init, init, init)

prior_1 <- c(set_prior("normal(0,10)", class = "b", coef = "as.numericyear"),
                 set_prior("normal(0,4)", class = "alpha"),
                  set_prior("student_t(3, 0, 5.9)", class = "sigma", lb = 0))


mod_departure_year_effect_brms_expanded <- brm(last_observation   | cens(censored_end_monitoring) ~  
              as.numeric(year) + 
              (1|tag_id) , 
              data = data_weather_departure,
              family = skew_normal(),
              prior = prior_1,
              init=  list_inits,  
              control = list(adapt_delta = 0.95, max_treedepth = 15),      
              chains=4, cores=4, iter = 4000)

I had tried setting a prior for a skew normal distribution (mu =250, alpha = -3, sigma = 10) but that did not work as intended.

Where did I go wrong?

I appreciate any help! (and if you need more information, please let me know).

All the best,
Carolin

PS: I looked at the log-transformed ordinal dates, but they are still skewed so a log_normal family would not fit well.

Happy new year to you too, and congratulations on your first post!

Yes, I wouldn’t worry about weird estimates from your second model. The fit is clearly not valid.

I’m just guessing, but might the large differences in scale in your variables be an issue here? You don’t tell so much about what your data actually looks like, but if it’s in counts of days, the left-censored data are perhaps much closer to zero?

I think I’d try to just scale the variables differently? What are you hoping to learn from the model? Letting the research question inform scaling is often helpful, I find - estimates are easier to interpret.

Hi @Kara!

Could you show us a few rows of the data? My hunch is the problem is not where you are looking for it.

1 Like

Just adding that Bulk-ESS of 4 with 4 chains indicates all chains are stuck locally. I agree that showing at least a bit of data is good next step

Hi everyone,

thanks for the replies!
I want to see if arrival or departures dates of my animals have changed across years (and if so, in which direction). So if arrival times have, for example, significantly advanced across years.

This what my data looks like.
All dates such as first_observation, star_monitoring are coded as ordinal dates (so the number of day in the year). Median of the left centered data is around 150, median of the right centered data around 250.

year tag_id start_monitoring first_observation censored_start_monitoring end_mon last_observation censored_end_monitoring
2021 ID_1 113 113 left 256 240 none
2021 ID_2 113 130 none 256 242 none
2021 ID_3 113 NA NA 256 256 right
2021 ID_4 113 135 none 256 NA NA
2021 ID_5 113 141 none 256 240 none
2021 ID_6 113 113 left 256 242 none
2022 ID_1 120 120 left 260 198 none
2022 ID_2 120 NA NA 260 NA NA
2022 ID_3 120 129 none 260 260 right

I then filter this dataset into two subsets, one only for arrival, one only for departure. In total around 11% of the arrival data are left censored, and 14% of departure dates are right censored.

I did not expect scaling to have a huge effect, as the only numerical predictor is the year and the model fitting the arrival date works fine. But I can certainly give rescaling a try!

Best,

Carolin

PS: In a next step I then want to see what parameters (e,g, sex, age, etc.) influence my arrival and departure dates. Here I will certainly have different scales, but first models with centered numerical variables did again not converge for the departure mode (but converged for arrival data)l. But I figured I start looking for help on this “simple” model including only the year first…

Is the issue specific to censoring? What happens if you model last observations excluding the censored data?

Hmm, this is tough. My first two thoughts were setting custom init values, and centering your predictors (especially year). But it looks like you’ve already done both.

@Solomon Yes, unfortunately centering year + custom init is still not working.

@amynang That was a good hunch! It is indeed specific to censoring!
Following your suggestion, I ran models only using a) all uncensored data, and b) all censored data. The completely for the uncesored data runs fine (even without custom inits), the fully censored data is again stuck.

I found this bug report, where someone also encountered issues fitting models with a right-censored normal distribution, using brms, while the left-censored worked fine:

I am definitely not good enough to understand the Stan code :D

Do you guys think that might be also the reason, why mine is not working?
And/or any suggestions on how to proceed, now that we know it is specific to censoring?

It’s above my skill set at this point.

1 Like

I am rusty, so don’t listen to me. But some thoughts. At this point, just go with learning Stan code. BRMS is a one liner in R that generates the Stan code.

The documentation is here, there’s a section on how to implement models with censored data in the documentation: chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/https://mc-stan.org/docs/2_38/stan-users-guide-2_38.pdf. It’s pretty straight forward.

And yes, rescaling (like (x- mean)/sd), or x / max(x) for x when x \in [0, inf)) will probably help, and you can just un-rescale afterward, just save which scalings you’ve used for which covariate.

And then, are you sure the priors and likelihood you’re using are good for what you’re trying to model? Take a look at literature on how people are modeling arrival times and then go from there. Make sure the assumptions of the model you use match your data… idk I haven’t looked into it much, but i.e. Poisson process inter arrival times follow exponential, so for example you could compute a distribution of interarrival times and see if if it roughly follows an exponential distribution (and if you really want, there are tests, K-S, you could simulate from an exponential and then compare with your distribution of arrival times)… but the Poisson process is assuming one thing is coming at a time, in a sequence, but animals may be leaving at the same time, no idea! That’s where I would head with it. I don’t know if it’s a bug may be your model just doesn’t really fit your data.