Brms: Right censored model is not converging, while a similar left censored model runs fine

Kara · January 2, 2026, 6:04pm

Hello everyone and a happy new year! :)

I have never posted here before, but I'll try my best. I am fitting a rather simple Bayesian linear mixed-effects model in brms (R version 4.4.2 and brms 2.23.0, on Windows 11).

I am analysing animal monitoring data that track a) when individuals of a species first appear at our research sites in a given year, and b) when they leave again for the winter.
The data are censored by when our monitoring started or ended each season: in some instances, some or even all individuals were already present at our study site when we arrived (left-censored data). Similarly, some individuals were still present when we left at the end of the season (right-censored data). The actual censoring dates differ between years, as monitoring started and ended on different days each season.

Looking at the raw data, the distributions of arrival and departure dates vary substantially between years. Sometimes I see strong unimodal peaks, when all individuals arrive or leave on nearly the same date. In other years, individuals arrive or depart across several weeks, or even in a bimodal pattern. Across all years, arrival dates are generally right-skewed, while departure dates are left-skewed. That was my rationale for using the skew_normal distribution.

As a first step, I wanted to see whether the arrival and departure dates changed across years. My response variables are ordinal dates (day of the year). As I have repeated measures for some individuals (but not all), I included the individual's ID (tag_id) as a grouping factor. Using default priors, this was my model for arrival (1010 observations across 262 IDs):

mod_arrival_year_effect <- brm(
  first_observation | cens(censored_start_monitoring) ~
    as.numeric(year) +
    (1 | tag_id),
  data = data_weather_arrival,
  family = skew_normal(),
  chains = 4, cores = 4, iter = 3000)

This model runs fine: Rhat, Bulk_ESS, and Tail_ESS all look good. The posterior predictive check (which comes with the warning that the censored responses are not included) doesn't look perfect to me, but I am happy that the model is at least running. This model also gives nearly identical results to a linear mixed-effects model (estimated with lmer).

Similarly, for departure I then ran this model (1217 observations across 314 IDs):

mod_departure_year_effect <- brm(
  last_observation | cens(censored_end_monitoring) ~
    as.numeric(year) +
    (1 | tag_id),
  data = data_weather_departure,
  family = skew_normal(),
  chains = 4, cores = 4)

This was the default prior:

                  prior     class           coef  group resp dpar nlpar lb ub tag       source
           normal(0, 4)     alpha                                                      default
                 (flat)         b                                                      default
                 (flat)         b as.numericyear                                  (vectorized)
 student_t(3, 252, 8.9) Intercept                                                      default
   student_t(3, 0, 8.9)        sd                                        0             default
   student_t(3, 0, 8.9)        sd                tag_id                  0        (vectorized)
   student_t(3, 0, 8.9)        sd      Intercept tag_id                  0        (vectorized)
   student_t(3, 0, 8.9)     sigma                                        0             default

Here is the summary:

 Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors.
 Family: skew_normal 
  Links: mu = identity 
Formula: last_observation  | cens(censored_end_monitoring) ~ as.numeric(year) + (1 | tag_id) 
   Data: data_weather_departure (Number of observations: 1217) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

 Multilevel Hyperparameters:
 ~tag_id (Number of levels: 314) 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
 sd(Intercept)     1.60      1.81     0.14     4.61 4.08        4       11
 
 Regression Coefficients:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
 Intercept         -5.20     13.60   -16.14    17.71 3.55        4       13
 as.numericyear     0.38      0.85    -1.01     1.13 3.70        4       13

 Further Distributional Parameters:
       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
 sigma    12.01      1.99    10.71    15.51 4.56        4       11 
 alpha     1.98      0.02     1.96     2.01 3.80        4       11

So… pretty bad. I am surprised that the intercept was estimated at -5.20 when it should be somewhere around 250 (but maybe that is due to the low effective sample sizes and does not mean anything at this stage?). I also got a lot of warnings during warmup (“Log probability evaluates to log(0), i.e. negative infinity”). Following suggestions I found online, I tried setting stronger priors, custom initial values, and higher adapt_delta and max_treedepth.
But I am not sure that I did this correctly:

init <- list(
  b_Intercept = 250,  
  b_as.numericyear = 0,  
  sd_tag_id = 1)
list_inits = list(init, init, init, init)

prior_1 <- c(set_prior("normal(0,10)", class = "b", coef = "as.numericyear"),
             set_prior("normal(0,4)", class = "alpha"),
             set_prior("student_t(3, 0, 5.9)", class = "sigma", lb = 0))


mod_departure_year_effect_brms_expanded <- brm(
  last_observation | cens(censored_end_monitoring) ~
    as.numeric(year) +
    (1 | tag_id),
  data = data_weather_departure,
  family = skew_normal(),
  prior = prior_1,
  init = list_inits,
  control = list(adapt_delta = 0.95, max_treedepth = 15),
  chains = 4, cores = 4, iter = 4000)

I also tried setting priors matching a skew normal distribution (mu = 250, alpha = -3, sigma = 10), but that did not work as intended.

Where did I go wrong?

I appreciate any help! (and if you need more information, please let me know).

All the best,
Carolin

PS: I looked at the log-transformed ordinal dates, but they are still skewed, so a lognormal family would not fit well.

erognli · January 3, 2026, 10:35am

Happy new year to you too, and congratulations on your first post!

Yes, I wouldn’t worry about weird estimates from your second model. The fit is clearly not valid.

I’m just guessing, but might the large differences in scale in your variables be an issue here? You haven’t told us much about what your data actually look like, but if they are counts of days, the left-censored data are perhaps much closer to zero?

I think I’d try scaling the variables differently. What are you hoping to learn from the model? Letting the research question inform the scaling is often helpful, I find: the estimates become easier to interpret.

amynang · January 3, 2026, 6:15pm

Hi @Kara!

Could you show us a few rows of the data? My hunch is the problem is not where you are looking for it.

avehtari · January 4, 2026, 4:10pm

Just adding that a Bulk-ESS of 4 with 4 chains indicates that all chains are stuck locally. I agree that showing at least a bit of the data is a good next step.
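
To make the "stuck locally" diagnosis concrete, here is a minimal base R sketch with simulated draws (the chain centres are hypothetical, not taken from the fit above). It computes the classic, non-split Gelman-Rubin Rhat by hand; brms reports a rank-normalised split version, so the numbers will not match exactly, but the mechanism is the same: when each chain sits at a different value, the between-chain variance dwarfs the within-chain variance.

```r
# Classic (non-split) Gelman-Rubin Rhat for a matrix of draws:
# rows = iterations, columns = chains.
basic_rhat <- function(draws) {
  n_iter <- nrow(draws)
  W <- mean(apply(draws, 2, var))        # mean within-chain variance
  B <- n_iter * var(colMeans(draws))     # between-chain variance
  var_plus <- (n_iter - 1) / n_iter * W + B / n_iter
  sqrt(var_plus / W)
}

set.seed(1)
n_iter <- 1000

# Four chains that each hover around a different value (hypothetical centres)
centres <- c(-16, -5, 5, 18)
stuck <- sapply(centres, function(m) rnorm(n_iter, mean = m, sd = 0.1))

# Four chains that all sample the same distribution
mixed <- matrix(rnorm(n_iter * 4), ncol = 4)

basic_rhat(stuck)  # far above 1.05
basic_rhat(mixed)  # close to 1
```

Trace plots of the real fit (for example via `plot()` on the fitted object) would show the same picture directly: flat, non-overlapping lines instead of four overlapping "caterpillars".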

Kara · January 6, 2026, 6:25pm

Hi everyone,

thanks for the replies!
I want to see whether the arrival or departure dates of my animals have changed across years (and if so, in which direction); for example, whether arrival times have advanced significantly across years.

This is what my data look like.
All dates, such as first_observation and start_monitoring, are coded as ordinal dates (i.e. the day of the year). The median of the left-censored data is around 150, and the median of the right-censored data is around 250.

year	tag_id	start_monitoring	first_observation	censored_start_monitoring	end_mon	last_observation	censored_end_monitoring
2021	ID_1	113	113	left	256	240	none
2021	ID_2	113	130	none	256	242	none
2021	ID_3	113	NA	NA	256	256	right
2021	ID_4	113	135	none	256	NA	NA
2021	ID_5	113	141	none	256	240	none
2021	ID_6	113	113	left	256	242	none
2022	ID_1	120	120	left	260	198	none
2022	ID_2	120	NA	NA	260	NA	NA
2022	ID_3	120	129	none	260	260	right

I then filter this dataset into two subsets, one only for arrival, one only for departure. In total around 11% of the arrival data are left censored, and 14% of departure dates are right censored.
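
For readers who want to reproduce the filtering step, here is a minimal base R sketch using the sample rows above, typed in by hand (departure columns only). The consistency check is an assumption about how the censoring flag was built, namely that a "right" row carries the last monitoring day of its year as its value; adjust it if the flag was defined differently.

```r
# Sample rows from the table above (departure columns only)
dat <- data.frame(
  year    = c(2021, 2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022),
  tag_id  = c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5", "ID_6",
              "ID_1", "ID_2", "ID_3"),
  end_mon = c(256, 256, 256, 256, 256, 256, 260, 260, 260),
  last_observation = c(240, 242, 256, NA, 240, 242, 198, NA, 260),
  censored_end_monitoring = c("none", "none", "right", NA, "none", "none",
                              "none", NA, "right")
)

# Departure subset: keep only rows with a recorded last observation
dep <- dat[!is.na(dat$last_observation), ]

# ASSUMPTION: a "right" row should sit exactly on the last monitoring day
is_right <- dep$censored_end_monitoring == "right"
stopifnot(all(dep$last_observation[is_right] == dep$end_mon[is_right]))

# Share of censored rows in the subset
prop.table(table(dep$censored_end_monitoring))
```

Running the same check on the full data set is a cheap way to rule out a mislabelled censoring flag before looking at the sampler.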

I did not expect scaling to have a huge effect, as the only numerical predictor is the year and the model fitting the arrival date works fine. But I can certainly give rescaling a try!

Best,

Carolin

PS: In a next step I want to see which parameters (e.g. sex, age, etc.) influence my arrival and departure dates. There I will certainly have different scales, but my first models with centered numerical variables again did not converge for the departure model (while they converged for the arrival data). So I figured I would start by looking for help on this “simple” model including only the year…

amynang · January 8, 2026, 6:00pm

Is the issue specific to censoring? What happens if you model last observations excluding the censored data?
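
A note on what this check can and cannot tell you: dropping the censored rows is a useful diagnostic for locating the problem, but it is probably not suitable as the final analysis, because the animals still present at the end of monitoring are exactly the late departers. A minimal simulation sketch, with hypothetical numbers rather than the thread's data:

```r
# Simulated departure days; all numbers are hypothetical
set.seed(123)
n        <- 5000
true_day <- rnorm(n, mean = 250, sd = 12)  # true (partly unobserved) departure day
end_mon  <- 256                            # last monitoring day
censored <- true_day > end_mon             # still present when monitoring ended
observed <- pmin(true_day, end_mon)        # what actually gets recorded

mean(censored)              # share of right-censored rows
mean(true_day)              # the quantity we would like to recover
mean(observed[!censored])   # mean after dropping the censored rows: biased early
```

So if the uncensored-only model runs while the censored one does not, that points at the censoring term of the likelihood, but the uncensored-only estimates should be expected to sit earlier than the truth.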

Solomon · January 8, 2026, 7:29pm

Hmm, this is tough. My first two thoughts were setting custom init values, and centering your predictors (especially year). But it looks like you’ve already done both.
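
One thing that may be worth double-checking about the custom inits: the list in the original post uses the names from the model summary (`b_Intercept`, `b_as.numericyear`, `sd_tag_id`), whereas, as far as I know, Stan matches initial values against the parameter names in the generated Stan code and initialises anything it cannot match at random. The sketch below only builds the list in base R. The names `Intercept`, `b`, `sd_1`, `sigma` and `alpha` are my assumption of what brms generates for this model, so please verify them against the output of `stancode()` on the fitted object before relying on this. The starting values themselves are hypothetical.

```r
# Sketch: initial values keyed by Stan parameter names rather than summary names.
# ASSUMPTION: Intercept, b, sd_1, sigma, alpha are the names in the generated
# Stan code for this model; check the parameters block of stancode(fit).
make_init <- function() {
  list(
    Intercept = 250 + runif(1, -5, 5),  # centred intercept, near the median departure day
    b         = array(0, dim = 1),      # one population-level slope (year)
    sd_1      = array(1, dim = 1),      # sd of the tag_id intercepts
    sigma     = 10,                     # residual scale, in days
    alpha     = 0                       # start without skew
  )
}

# One list per chain, with a little jitter so the chains do not start identically
set.seed(2026)
list_inits <- replicate(4, make_init(), simplify = FALSE)
str(list_inits[[1]])
```

The resulting `list_inits` would be passed to `brm(..., init = list_inits)` exactly as in the original call. Parameters left out of the list (such as the standardised group-level effects) would still be initialised at random.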

Kara · January 16, 2026, 3:22pm

@Solomon Yes, unfortunately centering year + custom init is still not working.

@amynang That was a good hunch! It is indeed specific to censoring!
Following your suggestion, I ran models using only a) the uncensored data, and b) the censored data. The model for the completely uncensored data runs fine (even without custom inits), whereas the model for the fully censored data gets stuck again.

I found this bug report, where someone also encountered issues fitting models with a right-censored normal distribution, using brms, while the left-censored worked fine:

github.com/stan-dev/math

Errors when right-censoring: bug in normal_lccdf?

opened 05:21PM - 14 Nov 22 UTC

simoncolumbus

I've been running into issues fitting models with a right-censored normal distribution, using `brms`. I've submitted an issue with `brms` here: https://github.com/paul-buerkner/brms/issues/1423

However, it appears that the error does not arise from `brms`. @paul-buerkner suggested it might instead arise from a bug in `normal_lccdf`, so I thought I'd post it here.

It appears that the issue occurs only
* using the `gaussian` family; others such as `student` or `weibull` work fine;
* when right-censoring, whereas left-censoring works fine (suggesting `normal_lccdf` as the culprit);
* quite recently (as others report running such models a few months back without issue).

Below is my original post to the brms issue tracker, containing a MWE. Apologies for not testing this with other Stan interfaces.

---

```r
# Censoring example
set.seed(25)
n <- 5000
t1 <- 106 # Right censoring cutoff
t2 <- 94  # Left censoring cutoff
d <- tibble(y = rnorm(n, mean = 100, sd = 15), x = rnorm(n, 0, 1)) %>%
  # Right censored variables
  mutate(y1 = if_else(y > t1, t1, y), cen1 = if_else(y > t1, "right", "none")) %>%
  # Left censored variables
  mutate(y2 = if_else(y < t2, t2, y), cen2 = if_else(y < t2, "left", "none"))

# Right censoring
fit <- brm(bf(y1 | cens(cen1) ~ x), data = d)

Compiling Stan program...
Start sampling
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can't start sampling from this initial value.
...
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"
 Family: gaussian
  Links: mu = identity; sigma = identity
Formula: y1 | cens(cen1) ~ x
   Data: d (Number of observations: 5000)
```

This occurs across multiple datasets and different kinds of models (linear mixed model with one random factor; mixed-effects location scale model), and when init = 0.

Stan code for this model (generated with brms 2.17.0):

```stan
functions {
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=-1,upper=2> cens[N];  // indicates censoring
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  real<lower=0> sigma;  // dispersion parameter
}
transformed parameters {
  real lprior = 0;  // prior contributions to the log posterior
  lprior += student_t_lpdf(Intercept | 3, 99.9, 9);
  lprior += student_t_lpdf(sigma | 3, 0, 9)
    - 1 * student_t_lccdf(0 | 3, 0, 9);
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = Intercept + Xc * b;
    for (n in 1:N) {
      // special treatment of censored data
      if (cens[n] == 0) {
        target += normal_lpdf(Y[n] | mu[n], sigma);
      } else if (cens[n] == 1) {
        target += normal_lccdf(Y[n] | mu[n], sigma);
      } else if (cens[n] == -1) {
        target += normal_lcdf(Y[n] | mu[n], sigma);
      }
    }
  }
  // priors including constants
  target += lprior;
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}
```

In contrast, left-censoring works:

```r
# Left censoring
brm(bf(y2 | cens(cen2) ~ x), data = d)
```

I tested this on two machines.

```r
# Machine 1
> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] brms_2.17.0     Rcpp_1.0.9      forcats_0.5.1   stringr_1.4.0   dplyr_1.0.6     purrr_0.3.4     readr_1.4.0     tidyr_1.1.3     tibble_3.1.8
[10] ggplot2_3.3.6   tidyverse_1.3.1

# Machine 2
> sessionInfo()
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8    LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] brms_2.17.0     Rcpp_1.0.9      forcats_0.5.2   stringr_1.4.1   dplyr_1.0.9     purrr_0.3.4     readr_2.1.2     tidyr_1.2.0     tibble_3.1.8
[10] ggplot2_3.3.6   tidyverse_1.3.2
```

I am definitely not good enough to understand the Stan code :D

Do you think that might also be the reason why mine is not working?
And/or do you have any suggestions on how to proceed, now that we know it is specific to censoring?
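
For anyone who, like me, finds the Stan code hard to read: the relevant part is that right-censored rows contribute log P(Y > y) (the `lccdf` term) and left-censored rows contribute log P(Y < y) (the `lcdf` term). The base R sketch below illustrates one way the two can behave differently when the sampler starts far from the data. It is an illustration under stated assumptions, not a confirmed diagnosis: the starting values `mu = 0` and `sigma = 5` are hypothetical stand-ins for random inits near zero, a plain normal is used instead of the skew normal, and whether Stan evaluates the upper tail in the "naive" way is not something this thread establishes.

```r
y_right <- 256   # a right-censored departure day (from the sample rows)
y_left  <- 113   # a left-censored arrival day (from the sample rows)
mu      <- 0     # ASSUMPTION: intercept near zero, as with random inits
sigma   <- 5     # ASSUMPTION: small residual scale at initialisation

# Right censoring: log P(Y > y_right), computed two ways
naive_right  <- log(1 - pnorm(y_right, mu, sigma))   # 1 - CDF rounds to 0 first
stable_right <- pnorm(y_right, mu, sigma,
                      lower.tail = FALSE, log.p = TRUE)  # stays on the log scale

# Left censoring: log P(Y < y_left); the CDF is essentially 1 here
left <- pnorm(y_left, mu, sigma, log.p = TRUE)

c(naive_right = naive_right, stable_right = stable_right, left = left)
```

With a start near zero, the left-censored term is harmless (the log of something close to 1), while the right-censored term is the log of an astronomically small number and collapses to `-Inf` if it is computed by subtraction. That asymmetry would be consistent with the “Log probability evaluates to log(0)” warnings appearing only in the departure model, and it suggests that starting values close to the observed dates matter much more for right-censored data.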

Solomon · January 18, 2026, 9:09pm

It’s above my skill set at this point.

drezap · January 19, 2026, 9:48am

I am rusty, so don’t listen to me. But some thoughts. At this point, I would just go with learning Stan code. brms is essentially a one-liner in R that generates the Stan code.

The documentation is here, and it has a section on how to implement models with censored data: https://mc-stan.org/docs/2_38/stan-users-guide-2_38.pdf. It’s pretty straightforward.

And yes, rescaling, e.g. (x - mean(x)) / sd(x), or x / max(x) when x is in [0, inf), will probably help, and you can just un-rescale afterward. Just save which scalings you’ve used for which variable.
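
A minimal sketch of that rescale, fit, un-rescale round trip, using simulated data with hypothetical numbers. `lm()` stands in for the brms model here only so that the sketch runs without Stan; the bookkeeping for the saved scalings would be the same.

```r
# Simulated data: departure day drifting earlier by 0.4 days per year (hypothetical)
set.seed(42)
year <- rep(2015:2024, each = 20)
day  <- 250 - 0.4 * (year - 2015) + rnorm(length(year), sd = 8)

# Save the scalings so they can be undone later
m_year <- mean(year); s_year <- sd(year)
m_day  <- mean(day);  s_day  <- sd(day)

year_z <- (year - m_year) / s_year
day_z  <- (day - m_day) / s_day

# Fit on the standardised scale (lm() as a stand-in for the real model)
fit_z <- lm(day_z ~ year_z)

# Back-transform the slope to "days per year"
slope_days_per_year <- coef(fit_z)[["year_z"]] * s_day / s_year
slope_days_per_year

# Same quantity estimated directly on the original scale, for comparison
coef(lm(day ~ year))[["year"]]
```

If the response is rescaled in a censored model, the recorded values of censored rows are transformed along with everything else, so the censoring indicator itself does not need to change.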

And then, are you sure the priors and likelihood you’re using are good for what you’re trying to model? Take a look at the literature on how people model arrival times and go from there. Make sure the assumptions of the model you use match your data. I haven’t looked into it much, but for example, the interarrival times of a Poisson process follow an exponential distribution, so you could compute the distribution of interarrival times and see if it roughly follows an exponential distribution (and if you really want, there are tests such as K-S, or you could simulate from an exponential and compare that with your distribution of interarrival times). But the Poisson process assumes that events come one at a time, in a sequence, whereas animals may be leaving at the same time, no idea! That’s where I would head with it. I don’t know if it’s a bug; maybe your model just doesn’t really fit your data.
