Debug model with interval and right censoring

  • Operating System: macOS
  • brms Version: 2.11.5

I’m trying to fit an intercept-only model with Gaussian data that has both interval and right censoring. The uncensored version of the data are \operatorname{Normal} (100, 15). They are interval censored from 94 to 100 and right censored from 106 on. Here are the data:

library(tidyverse)
library(brms)

# simulate the initial data
n <- 500

set.seed(25)

d <- tibble(y = rnorm(n, mean = 100, sd = 15))

# define the censoring thresholds
t1 <-  94
t2 <- 100
t3 <- 106

# update the data
d <-
  d %>% 
  mutate(y1  = if_else(y >= t1 & y < t2, t1,
                       if_else(y > t3, t3, y)),
         y2  = if_else(y >= t1 & y < t2, t2,
                       ifelse(y > t3, NA, y)),
         cen = if_else(y >= t1 & y < t2, "interval",
                       if_else(y > t3, "right", "none")))

The y1 column defines the lower bounds, y2 defines the upper bounds, and cen defines the type of censoring. Here’s my brm() code.

fit <-
  brm(data = d,
      family = gaussian,
      y1 | cens(cen, y2) ~ 1,
      prior = c(prior(normal(100, 15), class = Intercept),
                prior(normal(0, 15), class = sigma)),
      seed = 25)

This returns a long list of

Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.

that terminates in

Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."

Setting inits = 0 is no help. Where am I going wrong?

Hi Solomon,
I think there is not enough probability mass at the initial values for the parameter Intercept, (-2,2) is too far from the data. If you force the initial value for the Intercept to be near 100, it works.

n_chains <- 4
init_ll <- lapply(1:n_chains, function(id) list(Intercept = 100) )

fit <-
    brm(data = d,
        family = gaussian,
        y1 | cens(cen, y2) ~ 1,
        prior = c(prior(normal(100, 15), class = Intercept),
                  prior(normal(0, 15), class = sigma)),
        , chains = n_chains
        , inits = init_ll
        ,    seed = 25)

Also brms will remove the rows containing NAs in y2, you could put a 0 instead.

1 Like

Ah. This makes sense. Thanks @lott999!

Also, thank you for the insight on y2. I didn’t realize brms was dropping those cases. Is it the behavior, then, that when the censoring indicator reads "none" that the value in y2 won’t matter as long as it’s not missing?

If you look at the data generated by brms,

dat = make_standata(data = d,
            family = gaussian,
            y1 | cens(cen, y2) ~ 1)

dat$cens are the censoring labels recoded as {-1,0,1,2} for {‘left’,‘none’,‘right’,‘interval’) and dat$rcens is y2 but where all values have been changed to 0 except for those corresponding to an interval censoring in which case it should be the higher bound of the interval and the lower bound should be given in the response, as you have done.

1 Like

You can also see in the stan code generated by brms how the data are used

for (n in 1:N) {
      // special treatment of censored data
      if (cens[n] == 0) {
        target += normal_lpdf(Y[n] | mu[n], sigma);
      } else if (cens[n] == 1) {
        target += normal_lccdf(Y[n] | mu[n], sigma);
      } else if (cens[n] == -1) {
        target += normal_lcdf(Y[n] | mu[n], sigma);
      } else if (cens[n] == 2) {
        target += log_diff_exp(
          normal_lcdf(rcens[n] | mu[n], sigma),
          normal_lcdf(Y[n] | mu[n], sigma)
        );
      }
    }

rcens is only used when cens is at 2

Exactly. The values if y2 will be ignored in non interval censored cases

1 Like

Thanks y’all. I can confirm the model is now up and working just like I’d hoped. Success!

@paul.buerkner the write-up about censoring in the brms reference manual doesn’t quite make it clear that a variable like y2 would be ignored in cases where cen != 'interval'. Could you spell that out a bit in the documentation?

Good point. I updated the doc accordingly.

1 Like