I would like to implement a linear model with a censored explanatory variable:
y ~ censored(x, censoring_variable)
In other words, the x value can sometimes be observed only up to a certain point, and the true value may be larger.
So far I have only seen examples of censoring on the response variable (y axis):
bf(y | cens(censor_variable) ~ predictors)
This is like a survival model, but I’m not sure why time is still the response:
brm(time|cens(1-status)~1+x, data=..., family=brmsfamily("cox"))
In my case I don’t want time_to_death given proportion_cancer, but proportion_cancer given time_to_death, if that makes sense.
Thanks a lot
Sorry, it looks like your question fell through - did you manage to resolve it or do you still need help?
At the moment I model the censoring explicitly: for the censored cases, I add a positive parameter to the censored covariate in the design matrix. For example, for a censored instance I have this linear model:
proportions ~ intercept + slope * (days + **unseen_days**)
I am applying it in a Dirichlet regression:
prop ~ dirichlet(softmax(alpha * X_censoring) * precision);
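To make the construction above concrete, here is a minimal Python sketch (not the Stan model from this post). It assumes the design matrix has just an intercept and a days column, and that each category has its own intercept and slope; the function names and all numeric values are hypothetical. For a censored case the covariate is the observed days plus a positive unseen_days term, and the softmax of the linear predictor, scaled by the precision, gives the Dirichlet concentration.

```python
import math

def softmax(values):
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dirichlet_concentration(days, unseen_days, censored, alpha, precision):
    # For censored cases the true covariate is days + unseen_days (unseen_days > 0);
    # for non-censored cases the observed days are used as they are.
    effective_days = days + (unseen_days if censored else 0.0)
    # One (intercept, slope) pair per category: this layout is an assumption.
    linear = [intercept + slope * effective_days for intercept, slope in alpha]
    return [precision * p for p in softmax(linear)]

# Hypothetical coefficients for three categories and a hypothetical precision.
alpha = [(0.0, 0.0), (0.5, -0.01), (-0.5, 0.02)]
conc = dirichlet_concentration(100.0, 30.0, True, alpha, 50.0)
```

The concentration vector always sums to the precision, so unseen_days only moves mass between categories. This only illustrates the construction; the result described next comes from the full Stan model.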
I get this kind of result
The y axis shows the inferred proportions (red CIs for the non-censored cases, grey squares for the CIs of the censored cases), and the x axis shows the days (red for the observed days to death of the non-censored cases, grey squares for the CIs of the inferred days to death). As expected, there is a lot of uncertainty for the unseen days. I am controlling them with a standard censored-data model of the distribution of days:
// Modelling the days of death as gamma distribution
real mu = prior_unseen_alpha * exp(-prior_unseen_beta);
// Using here the pure data (X) of days to death, and using cumulative distributions for the ones that are censored
target += gamma_lpdf(X[which_not_cens,2] | prior_unseen_alpha, mu);
target += gamma_lccdf( X[which_cens,2] | prior_unseen_alpha, mu);
// Modelling the days to death for the DAYS + UNSEEN_DAYS
X_[which_cens,2] ~ gamma( prior_unseen_alpha, mu);
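As a rough illustration of the censored-likelihood idea in the two `target +=` lines above: non-censored days contribute the log density, while censored days contribute the log of the survival function P(X > x), which is what `gamma_lccdf` computes. The Python sketch below is not the model from this thread. Its function names and values are hypothetical, and it assumes an integer shape (an Erlang distribution) so that the survival function has a closed form using only the standard library. As in Stan, the gamma is parameterised by shape and rate.

```python
import math

def gamma_logpdf(x, shape, rate):
    # Log density of a gamma distribution with the given shape and rate.
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)

def erlang_log_survival(x, shape, rate):
    # log P(X > x) for INTEGER shape (assumption made for this sketch):
    # P(X > x) = exp(-rate * x) * sum_{k < shape} (rate * x)^k / k!
    z = rate * x
    partial = sum(z ** k / math.factorial(k) for k in range(shape))
    return -z + math.log(partial)

def censored_loglik(days, censored, shape, rate):
    # Observed days add the log density; censored days add the log survival.
    total = 0.0
    for x, is_censored in zip(days, censored):
        if is_censored:
            total += erlang_log_survival(x, shape, rate)
        else:
            total += gamma_logpdf(x, shape, rate)
    return total

# Hypothetical data: two observed deaths and one case censored at day 300.
loglik = censored_loglik([120.0, 250.0, 300.0], [False, False, True], 2, 0.01)
```

A censored case only tells the model that the true value exceeds the last observed day, which is why its contribution is a tail probability rather than a density.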
However, I am not sure that modelling this explicitly is the right thing to do. I have the feeling that the censored instances are driven by the regression prior and simply go wherever the regression pushes them.
This looks very roughly OK to me. In particular, I don’t think you are likely to be able to get away without having an explicit parameter for the censored variables (integrating them out seems difficult/impossible). But there is a lot of implementation stuff I don’t understand from your description so I am not sure the implementation matches the idea - for example, how does
X_ relate to
If you want me to look at it in more depth, I would need to see the whole model…