Survival data modeling using mixture priors

Hi all,

This is my first post to the Stan forums, so if there is any helpful additions or modifications that I can make to my post, please let me know and thank you for your patience.

I have successfully fit a survival model in Stan and RStan using the documentation that is provided in the Stan User Manual. The code is exactly the same as that given in the manual:

data {
  int<lower=0> K;          // num covariates

  int<lower=0> N;          // num uncensored obs
  vector[N] t;             // event time (non-strict decreasing)
  matrix[N, K] x;          // covariates for uncensored obs

  int N_c;                 // num censored obs
  vector[N_c] t_c;    // censoring time
  matrix[N_c, K] x_c;      // covariates for censored obs
}

parameters {
  vector[K] beta;          // slopes (no intercept)
}

model {
  beta ~ normal(0, 3);

  vector[N] log_theta = x * beta;
  vector[N_c] log_theta_c = x_c * beta;

  real log_denom = log_sum_exp(log_theta_c);

  for (n in 1:N) {
    log_denom = log_sum_exp(log_denom, log_theta[n]);
    target += log_theta[n] - log_denom;   // log likelihood
  }

}

I am now trying to fit the same proportional hazards model only using a mixture of normal distributions as the prior, in particular \beta \sim [0.4 \cdot N(0,4)] + [0.6 \cdot N(0,0.24)]. I’ve reviewed what is available in the Stan User Manual on Finite Mixtures, but cannot seem to get started on fitting the model with a finite mixture prior for survival specifically.

The motivation for the prior is to explore the effect if there is prior-data conflict in the setting of a lower number of events, which tends to increase the importance of a strongly informative prior.

Thanks in advance for the help!
M

Hi, @mazin.abdelghany, and welcome to the Stan forums. Sorry it’s been so slow answering questions over the holidays in the U.S.

I don’t know if we mention mixtures for priors, but they work the same as mixtures for observed variables. All you need to do is replace

beta ~ normal(0, 3);

with

for (k in 1:K) {
  target += log_mix(0.4, normal_lpdf(beta[k], 0, 4),
                         normal_lpdf(beta[k], 0, 0.24));
}

The literal translation of what you wrote down, converted to the log scale, would be:

for (k in 1:K) {
  target += log_sum_exp(log(0.4) + normal_lpdf(beta[k], 0, 4),
                        log(0.6) + normal_lpdf(beta[k], 0, 0.24));

That’s because

log_sum_exp(log(p1), log(p2)) = log(p1 + p2)

and

log(0.4 * normal(beta[k] | 0, 4)) = log(0.4) + log(normal(beta[k] | 0, 4))
                                  = log(0.4) + normal_lpdf(beta[k] | 0, 4).

Hi Dr. Carpenter,

No worries at all about the delay in response. Thanks so much for the advice and help!

The replacement for the beta prior works as expected. The only modification I had to make was syntactical:

for (k in 1:K) {
  // _lpdf functions require a | rather than comma
  target += log_mix(0.4, normal_lpdf(beta[k] | 0, 4),
                         normal_lpdf(beta[k] | 0, 0.24));
}

Hope you’re having a restful break.
Best,
Mazin