I’ve been modelling some lifetime data that is right censored. In my analysis, I am imputing the values of the right-censored lifetimes following the guidance of the Stan User’s Guide, but I am unsure about the correct terminology for the prior I place on the “missing” censored lifetimes. In Stan code, the model is:
data {
  int N_observed;
  int N_censored;
  vector<lower = 0>[N_observed] y_observed;
  vector<lower = 0>[N_censored] y_censored_lb;
  vector<lower = 0>[N_censored] y_censored_ub;
}
parameters {
  real<lower = 0> beta;
  real<lower = 0> eta;
  vector<lower = y_censored_lb, upper = y_censored_ub>[N_censored] y_censored;
}
model {
  eta ~ normal(1, 1);
  beta ~ normal(1.1, 1);
  y_observed ~ weibull(beta, eta);
  y_censored ~ weibull(beta, eta);
}
Would it be correct to say that I am assigning a truncated prior to the missing data? Or is this incorrect because I’m not normalising the distributions by using the T[] function? e.g.
y_censored ~ weibull(beta, eta) T[y_censored_lb, y_censored_ub];
I don’t use T[] in this model because later, in a different version of the model, I also include left truncated lifetimes.
For the vast majority of applications, it is safe to drop normalizing constants in Stan (when they are truly constant in the parameters). In fact, the statement y_censored ~ weibull(beta, eta); already drops normalizing constants. So in this sense, it’s fine to say that you are sampling from the truncated distribution even when you don’t explicitly renormalize.
For what it’s worth, you are in a sort of murky setting where it’s challenging to separate what a prior is from what a likelihood is. In particular, you don’t know a priori which data will be censored, which is a hint that it’s weird to talk about putting a prior on the censored values. I’m not aware of clearly defined rules for how to approach talking about this stuff. In my personal opinion, although y_censored is nominally a parameter, I would be more inclined to think of the declaration of y_censored (which is data-dependent) and the statement y_censored ~ weibull(beta, eta); as part of the machinery for computing the likelihood, rather than as part of your prior.
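One way to sanity-check the "machinery for computing the likelihood" view is numerically. Integrating the Weibull density over the declared bounds of a y_censored element gives F(ub) - F(lb), which is the usual likelihood contribution of an interval-censored observation (for right censoring, ub is infinite and this becomes 1 - F(lb)). Below is a minimal Python sketch of that check using only the standard library. The parameter values, the interval, and the helper names are made up for illustration, and the density is written in the shape/scale form that I am assuming matches weibull(beta, eta) in the model above.

```python
import math

def weibull_pdf(y, beta, eta):
    """Weibull density with shape beta and scale eta."""
    return (beta / eta) * (y / eta) ** (beta - 1) * math.exp(-((y / eta) ** beta))

def weibull_cdf(y, beta, eta):
    """Weibull CDF with shape beta and scale eta."""
    return 1.0 - math.exp(-((y / eta) ** beta))

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical parameter values and censoring interval (illustration only).
beta, eta = 1.3, 2.0
lb, ub = 1.5, 4.0

# Marginalizing the imputed lifetime over its declared bounds.
marginal = integrate(lambda y: weibull_pdf(y, beta, eta), lb, ub)

# Closed-form interval-censored likelihood term, F(ub) - F(lb).
interval_prob = weibull_cdf(ub, beta, eta) - weibull_cdf(lb, beta, eta)

# With T[lb, ub], the density would be divided by interval_prob,
# so the same integral would come out as 1 for every (beta, eta).
marginal_truncated = marginal / interval_prob

print(marginal, interval_prob, marginal_truncated)
```

The first two printed numbers should agree up to integration error. The third shows what renormalizing with T[] would do: the factor F(ub) - F(lb) depends on beta and eta, and dividing by it makes the integrated contribution 1, so the censored observations would stop informing the parameters.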
Thanks, @jsocolar. This was very helpful. I agree with your last thoughts on referring to y_censored as part of the machinery for computing the likelihood; it’s a nice way of putting it.