Hello,
I’m having trouble specifying a Stan model for a problem with multiple forms of censoring. When running our reliability experiment we add stress to the system in discrete pulses, and we can only assess whether a unit has experienced the survival event after a pulse has ended. So every point y \sim Weibull(\alpha, \sigma) whose event is "observed" is really interval censored on [floor(y), ceil(y)]. In addition, some subjects survive the whole test and are right censored at the end of the test.
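To make the likelihood I am aiming for explicit, write F(t) = 1 - \exp(-(t/\sigma)^\alpha) for the Weibull(\alpha, \sigma) CDF and t_end for the time the test ends. Each interval-censored point should contribute

\log( F(ceil(y)) - F(floor(y)) ),

and each right-censored point should contribute

\log( 1 - F(t_end) ) = -(t_end / \sigma)^\alpha.

Here is my Stan code implementing this (observations falling in the same bin are aggregated into counts, N_cens):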
functions {
  // required to avoid a zero during accumulation, see
  // https://discourse.mc-stan.org/t/interval-censored-data-fails-with-weibull-but-not-gamma/28780/3
  real my_weibull_lcdf(real t, real alpha, real sigma) {
    return log1m_exp(-pow(t / sigma, alpha));
  }
}
data {
  int<lower=0> N_intervals;                  // number of survival interval bins
  array[N_intervals] int<lower=0> observed;  // censoring code: 0 = interval, 1 = right
  vector<lower=0>[N_intervals] low_bound;    // lower bound of a bin
                                             // (also the censoring time for right-censored bins)
  vector<lower=0>[N_intervals] high_bound;   // upper bound of a bin (unused for right-censored bins)
  vector<lower=0>[N_intervals] N_cens;       // number of observations in a bin
  int<lower=0> N_draws;                      // number of posterior predictive draws
}
parameters {
  real<lower=0> alpha;
  real<lower=0> sigma;
}
model {
  // weak priors for testing
  alpha ~ lognormal(0, 1);
  sigma ~ lognormal(0, 1);
  // likelihood
  for (n in 1:N_intervals) {
    if (observed[n] == 0) {
      // interval censoring: log(F(high) - F(low))
      target += N_cens[n] * log_diff_exp(
        my_weibull_lcdf(high_bound[n] | alpha, sigma),
        my_weibull_lcdf(low_bound[n] | alpha, sigma)
      );
    } else if (observed[n] == 1) {
      // right censoring: log(1 - F(low))
      target += N_cens[n] * weibull_lccdf(low_bound[n] | alpha, sigma);
    }
  }
}
generated quantities {
  vector<lower=0>[N_draws] y_tilde;
  for (n in 1:N_draws) {
    y_tilde[n] = weibull_rng(alpha, sigma);
  }
}
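For reference, the simulated data going into the fit below are generated roughly like this (a schematic sketch rather than my exact script; the "true" parameter values, sample size, and seed are placeholders):

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Placeholder settings, not my actual ones.
alpha_true, sigma_true = 1.5, 4.0    # "true" Weibull shape and scale
N, test_end = 200, 10.0              # sample size; the test ends at t = 10

y = sigma_true * rng.weibull(alpha_true, size=N)   # latent event times

# Events seen before the test ends are interval censored at the pulse
# boundaries; everything else is right censored at the end of the test.
event = y < test_end
low = np.where(event, np.floor(y), test_end)
high = np.where(event, np.ceil(y), test_end)   # unused for right-censored rows
observed = np.where(event, 0, 1)               # 0 = interval, 1 = right

# Aggregate identical (observed, low, high) rows into per-bin counts.
bins = (pd.DataFrame({"observed": observed, "low_bound": low, "high_bound": high})
          .groupby(["observed", "low_bound", "high_bound"], as_index=False)
          .size()
          .rename(columns={"size": "N_cens"}))

stan_data = {
    "N_intervals": len(bins),
    "observed": bins["observed"].tolist(),
    "low_bound": bins["low_bound"].tolist(),
    "high_bound": bins["high_bound"].tolist(),
    "N_cens": bins["N_cens"].tolist(),
    "N_draws": 1000,
}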
This works as I would hope; here is an example fit to some simulated data (a test that ends at 10):
Where things go off the rails is when I have multiple right-censoring times. An example of how this arises in practice is a second measurement on the experimental probe: if a certain signal-to-noise threshold is breached, we no longer trust the results of our experiment, and I want to right-censor that observation. In my simulation I mimic this by randomly flipping individual observations to the right-censored state with probability p (i.e. \sim Bernoulli(p)), roughly as in the sketch below.
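A minimal sketch of that step; here a flipped observation is right censored at the lower bound of its original interval, i.e. at the last pulse it was known to have survived:

import numpy as np

def flip_to_right_censored(low, high, observed, p, rng):
    """Randomly convert observations to right censored with probability p.

    Sketch only: a flipped observation is right censored at the lower
    bound of its original interval (the last pulse it was known to survive).
    """
    flip = rng.random(len(observed)) < p       # Bernoulli(p) per observation
    observed = np.where(flip, 1, observed)     # 1 = right censored
    high = np.where(flip, low, high)           # high_bound is unused once right censored
    return low, high, observed

# Applied to the per-observation arrays from the sketch above, before they
# are aggregated into per-bin counts, e.g.:
# low, high, observed = flip_to_right_censored(low, high, observed, p=0.6,
#                                              rng=np.random.default_rng(2))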
In particular, my expectation is that censoring more values this way should widen the uncertainty on the Weibull parameters, but that the collection of posterior predictive realizations should still broadly follow the uncensored simulated data. What I am seeing instead is that the posterior realizations diverge rapidly from the observed data as I increase the rate at which data are censored:
These data reproduce the p = 0.6 plot:
p_0_6_right_censored.csv (221 Bytes)
Should I be expecting this? Am I missing something about how right-censored data should affect the posterior? That the rate of right censoring affects the result this strongly suggests there is something I don’t understand about my model as written. Any help is appreciated!