Dear Community,
I would like to implement a linear model with a censored explanatory variable:
y ~ censored(x, censoring_variable)
In other words, the x value can sometimes only be observed up to a certain point, and the true value may be larger.
So far I have only seen examples of censoring on the response variable (y axis):
bf(y | cens(censor_variable) ~ predictors)
It is like a survival model, but I’m not sure why time is still on the response side:
brm(time|cens(1-status)~1+x, data=..., family=brmsfamily("cox"))
In my case I don’t want time_to_death given proportion_cancer, but proportion_cancer given time_to_death, if that makes sense.
Thanks a lot
Sorry, it looks like your question fell through - did you manage to resolve it or do you still need help?
At the moment I model the censored information explicitly, by adding a positive parameter to the censored covariate in the design matrix for the censored cases. For example, for a censored instance I have this linear model:
proportions ~ intercept + slope * (days + **unseen_days**)
I am applying it in a Dirichlet regression:
prop ~ dirichlet(softmax(alpha * X_censoring) * precision);
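Written out as a formula, the line above reads roughly as follows. This is only a sketch: the symbols are assumptions introduced for illustration and are not taken from the model code. $K$ is the number of proportion categories, $\mathbf{x}_i = (1,\; d_i + u_i)$ is the design row with recorded days $d_i$ and unseen days $u_i$ ($u_i = 0$ for non-censored cases, $u_i > 0$ for censored ones), $\boldsymbol{\alpha}$ is the coefficient matrix, and $\phi > 0$ is the precision.

$$
\mathbf{p}_i \sim \mathrm{Dirichlet}\!\big(\phi \cdot \mathrm{softmax}(\boldsymbol{\alpha}^{\top}\mathbf{x}_i)\big),
\qquad
\mathrm{softmax}(\boldsymbol{\eta})_k = \frac{e^{\eta_k}}{\sum_{j=1}^{K} e^{\eta_j}}
$$

Because the softmax output sums to one, it is the mean of $\mathbf{p}_i$, and $\phi$ only controls how tightly the proportions concentrate around that mean.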
I get this kind of result
Here the y axis shows the inferred proportions (red CIs for non-censored cases, grey squares for the CIs of censored cases) and the x axis shows the days (red for the observed days to death, grey squares for the CIs of the inferred days to death). As expected, there is a lot of uncertainty for the unseen days. I am constraining them with a standard censored-data model of the distribution of days:
// Modelling the days of death as gamma distribution
real mu = prior_unseen_alpha * exp(-prior_unseen_beta);
// Using here the pure data (X) of days to death, and using cumulative distributions for the ones that are censored
target += gamma_lpdf(X[which_not_cens,2] | prior_unseen_alpha, mu);
target += gamma_lccdf( X[which_cens,2] | prior_unseen_alpha, mu);
// Modelling the days to death for the DAYS + UNSEEN_DAYS
X_[which_cens,2] ~ gamma( prior_unseen_alpha, mu);
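For reference, the two `target +=` statements correspond to the usual right-censoring likelihood for the days. As a sketch, with symbols introduced here purely for illustration: $d_i$ is the recorded days, $\delta_i = 1$ marks a censored case, $f$ and $F$ are the gamma density and CDF, $a$ is the shape (`prior_unseen_alpha`) and $b$ is the rate (the second argument of Stan's `gamma`, here `mu`).

$$
\log L(a, b) = \sum_{i:\,\delta_i = 0} \log f(d_i \mid a, b) \;+\; \sum_{i:\,\delta_i = 1} \log\big[\,1 - F(d_i \mid a, b)\,\big]
$$

If `X_` holds the latent total $x_i = d_i + u_i$ with $u_i > 0$, then the sampling statement on `X_` is tied to the second sum by

$$
\int_{d_i}^{\infty} f(x \mid a, b)\, dx = 1 - F(d_i \mid a, b),
$$

so integrating the latent value out of its density term already gives the censored contribution. Under that assumption, it may be worth checking whether keeping both the `gamma_lccdf` term and the sampling statement counts the censored cases twice when informing $a$ and $b$.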
However, I am not sure that modelling this explicitly is the right thing to do. I have the feeling that the censored instances are driven by the regression prior and simply go wherever the regression pushes them.
This looks very roughly OK to me. In particular, I don’t think you are likely to be able to get away without having an explicit parameter for the censored variables (integrating them out seems difficult/impossible). But there is a lot of implementation stuff I don’t understand from your description so I am not sure the implementation matches the idea - for example, how does X_ relate to X?
If you want me to look at it in more depth, I would need to see the whole model…