Regression with censored explanatory variable

Dear Community,

I would like to implement a linear model with a censored explanatory variable:

y ~ censored(x, censoring_variable)

In other words, x is sometimes observed only up to a certain point, and its true value may be larger.

So far I have only seen examples of censoring on the response variable (y axis):

bf(y | cens(censor_variable) ~ predictors)

It is like a survival model, but I’m not sure why time is still the response:

brm(time|cens(1-status)~1+x, data=..., family=brmsfamily("cox"))

In my case I don’t want time_to_death given proportion_cancer, but proportion_cancer given time_to_death, if that makes sense.

Thanks a lot

Sorry, it looks like your question fell through - did you manage to resolve it or do you still need help?


At the moment I model the censored information explicitly: for the censored cases, I add a positive parameter to the censored covariate in the design matrix. For example, for a censored instance I have this linear model

proportions ~ intercept + slope * (days + **unseen_days**)

I am applying it in a dirichlet regression

prop ~ dirichlet(softmax(alpha * X_censoring) * precision);

I get this kind of result

The y axis shows the inferred proportions (red CIs for non-censored cases, grey squares for the CIs of censored cases). The x axis shows the days (red for the observed days to death of non-censored cases, grey squares for the CIs of the inferred days to death). As expected, there is a lot of uncertainty for the unseen days. I am constraining them with a standard censored-data model of the distribution of days

// Model the days to death with a gamma distribution.
// Stan's gamma takes (shape, rate); mu is used as the rate here, so the mean is exp(prior_unseen_beta)
real mu = prior_unseen_alpha * exp(-prior_unseen_beta);

// Observed days to death (X) enter through the density; censored ones through the complementary CDF
target += gamma_lpdf(X[which_not_cens, 2] | prior_unseen_alpha, mu);
target += gamma_lccdf(X[which_cens, 2] | prior_unseen_alpha, mu);

// Days to death for the censored cases, DAYS + UNSEEN_DAYS
X_[which_cens, 2] ~ gamma(prior_unseen_alpha, mu);

However, I am not sure that modelling this explicitly is the right thing to do. I have the feeling that the censored values are driven by the regression prior and simply go wherever the regression pushes them.
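One way to examine that feeling is to write out what a single censored case contributes to the joint density. This is only a sketch; the symbols are introduced here for illustration: c_i is the observed (censoring) day, x_i the latent true day constrained to x_i > c_i, theta the regression parameters, and p(x | alpha, beta) the gamma density with CDF F.

```latex
% Joint contribution of one censored case with an explicit latent x_i
p(y_i, x_i \mid \theta, \alpha, \beta)
  = p(y_i \mid x_i, \theta)\; p(x_i \mid \alpha, \beta)\; \mathbb{1}[x_i > c_i]

% Integrating the covariate part over the latent value recovers the usual censoring term
\int_{c_i}^{\infty} p(x_i \mid \alpha, \beta)\, dx_i = 1 - F(c_i \mid \alpha, \beta)
```

The second line is what gamma_lccdf computes on the log scale. So when the latent value has a lower bound at c_i and is given the gamma density, the censoring information is already in the model, and a separate gamma_lccdf term for the same case would count it a second time. Whether that applies here depends on how X_ is built from X. The first line also shows that x_i is informed by both the days distribution and the regression term, which is expected in a joint model rather than a sign that something is wrong.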


This looks very roughly OK to me. In particular, I don’t think you are likely to be able to get away without having an explicit parameter for the censored variables (integrating them out seems difficult/impossible). But there is a lot of implementation stuff I don’t understand from your description so I am not sure the implementation matches the idea - for example, how does X_ relate to X?

If you want me to look at it in more depth, I would need to see the whole model…
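For reference, the idea of an explicit parameter for each censored covariate value can be sketched in Stan as below. This is a minimal illustration with a normal outcome, not the Dirichlet model from the thread. All names (x_obs, x_low, unseen, x_true, shape, rate) and all priors are assumptions chosen for the example, and the priors would need rescaling for real data.

```stan
data {
  int<lower=0> N_obs;               // cases where x is fully observed
  int<lower=0> N_cens;              // cases where x is right-censored
  vector<lower=0>[N_obs] x_obs;     // observed covariate values
  vector<lower=0>[N_cens] x_low;    // censoring points: the true x exceeds these
  vector[N_obs] y_obs;              // outcomes for the observed-x cases
  vector[N_cens] y_cens;            // outcomes for the censored-x cases
}
parameters {
  real intercept;
  real slope;
  real<lower=0> sigma;
  real<lower=0> shape;              // gamma shape for the covariate distribution
  real<lower=0> rate;               // gamma rate for the covariate distribution
  vector<lower=0>[N_cens] unseen;   // unobserved amount beyond each censoring point
}
transformed parameters {
  // Latent true covariate; a shift by data, so no Jacobian adjustment is needed
  vector[N_cens] x_true = x_low + unseen;
}
model {
  // Placeholder priors (assumptions for this sketch)
  intercept ~ normal(0, 5);
  slope ~ normal(0, 5);
  sigma ~ normal(0, 5);
  shape ~ exponential(1);
  rate ~ exponential(1);

  // Covariate model: observed and latent values share one gamma distribution.
  // The lower bound on x_true carries the censoring, so no lccdf term is added.
  x_obs ~ gamma(shape, rate);
  x_true ~ gamma(shape, rate);

  // Outcome model: censored cases use the latent covariate
  y_obs ~ normal(intercept + slope * x_obs, sigma);
  y_cens ~ normal(intercept + slope * x_true, sigma);
}
```

In this layout the relationship between the observed and latent covariate is explicit: x_true plays the role of days + unseen_days, and it only exists for the censored cases.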