Simulation Study of Survival-Longitudinal Data Joint Modeling - Some Estimations Are Off

Hello Everyone,

I’m trying to do a simulation study of survival-longitudinal joint modeling. Currently working on the complete-case situation (i.e., no censoring). Since this is a simulation, I know the true parameters. It seems that the estimated coefficients of the covariates (X) in both the survival sub-model and longitudinal sub-model are quite off. Others are acceptable. I wonder if there’s any specific reason for it.

I would really appreciate any suggestions that can speed up the computation. Thanks a lot!

Best,
Li

Model

Survival Sub-Model

\lambda_i = \exp\{ \alpha_0 + \sum_{k=1}^p \alpha_k x_{ik} + \gamma_1 I_i + \gamma_2 S_i + \gamma_3 \log(V_i) \} \tag{1.1}
h_i(t) = \rho \lambda_i t^{\rho-1} \iff T_i^{surv} \sim \mathcal{Weibull}(\rho, \lambda_i^{-\frac{1}{\rho}}) \tag{1.2}
T_i^{obs} = \min(T_i^{cens}, T_i^{obs}), \quad T_i^{cens} \perp T_i^{surv} |X_i \tag{1.3}

Longitudinal Sub-Model

Y_{ij} = \beta_0 + \sum_{k=1}^p \beta_k x_{ik} + \tau t_{ij} + I_i + S_i t_{ij} + e_{ij}, \quad t_{ij} \le T_i^{obs} \tag{2.1}
I_i \sim \mathcal{N}(0, \sigma_I^2) \tag{2.2}
S_i \sim \mathcal{Weibull}(\rho, \lambda_i) \tag{2.3}
e_{ij} \sim \mathcal{N}(0, V_i) \tag{2.4}
\log(V_i) \sim \mathcal{N}(\mu_V, \sigma_V^2) \tag{2.5}

Data Simulation

########################## true parameters ##########################
#### Scenario A: biomarker variability not associated with survival
m = 300 # number of patients
p = 3 # number of covariates
n = sample(x = 10:50, size = m, replace = T) # number of check-ups
### Survival
rho = 2
alpha_0 = -1.5
alphas = c(0.5, -0.75, 1.0)
gamma_1 = 0.2
gamma_2 = 1
gamma_3 = 0
### Longitudinal
beta_0 = 24 
betas = c(-0.5, 2, 1)
tau = 0
sigma_I = 2
sigma_S = 0.5
mu_V = 1.5
sigma_V = 0.25
#### Scenario B: biomarker variability associated with survival
gamma_3 = 0.5
#####################################################################
# Simulation
set.seed(123)
X = matrix(nrow = m, ncol = p)
Y = list()
e = list()
I = numeric(m)
S = numeric(m)
V = numeric(m)
lambda = numeric(m)
T_surv = numeric(m)
T_cens = numeric(m)
T_obs = numeric(m)
status = numeric(m)
t = list()
T_obs = numeric(m)
for (i in 1:m) {
  X[i, ] <- rnorm(p, mean = 0, sd = 1)
  I[i] <- rnorm(1, mean = 0, sd = sigma_I)
  S[i] <- rnorm(1, mean = 0, sd = sigma_S)
  V[i] <- rlnorm(1, meanlog = mu_V, sdlog = sigma_V)
  # survival sub-model
  lambda[i] <- exp(alpha_0 + sum(alphas * X[i]) + gamma_1 * I[i] + gamma_2 * S[i] + gamma_3 * log(V[i]))
  T_surv[i] <- rweibull(1, rho, (lambda[i] ^ (- 1 / rho)))
  T_cens[i] <- runif(1, min = 0, max = 12)
  T_cens[i] <- Inf # no censor
  T_obs[i] <- min(T_cens[i], T_surv[i])
  status[i] <- as.numeric(T_surv[i] <= T_cens[i])
  # longitudinal sub-model
  t[[i]] <- sort(runif(n[i])) * T_obs[i]
  e[[i]] <- rnorm(n[i], 0, V[i])
  Y[[i]] <- beta_0 + I[i] + sum(betas * X[i]) + (tau + S[i]) * t[[i]] + e[[i]]
}  
#####################################################################
# Model Fitting
sm <- stan_model("joint_model_cc.stan")
stan_data <- list(
  m = m, 
  p = p,
  n = n,
  N = sum(n),
  X = X, 
  y = unlist(Y),
  t = unlist(t),
  T_obs = T_obs
)
fit <- sampling(sm, data = stan_data, seed = 42, chains = 8, cores = 4, iter = 10000)
data {
  int<lower=1> m; // number of patients
  int<lower=0> p; // number of covariates
  int n[m]; // array of observations per patient
  int<lower=1> N; // number of observations    
  matrix[m, p] X; // covariates matrix
  vector[N] y; // flattend longitudinal outcome
  vector[N] t; // flattend check-up points
  vector[m] T_obs; // survival time
}

parameters {
  // Survival Sub-Model
  real<lower=0> rho; // suvival sub-model Weibull distribution shape
  real alpha_0; // survival sub-model intercept
  vector[p] alphas; // survival sub-model covariates coefficients
  real gamma_1; // survival sub-model random intercept coefficient
  real gamma_2; // survival sub-model random slope coefficient
  real gamma_3; // survival sub-model random variability coefficient
  
  // Longitudinal Sub-Model
  real beta_0; // longitudinal sub-model intercept
  vector[p] betas; // longitudinal sub-model covariates coefficients
  real tau; // longitudinal fixed time coefficient
  real<lower=0> sigma_I; // random intercept standard deviation
  real<lower=0> sigma_S; // random slope standard deviation
  real mu_V; // random variability log-normal mean
  real<lower=0> sigma_V; // random variability log-normal standard deviation
  
  vector[m] I; // random intercept
  vector[m] S; // random slope
  vector<lower=0>[m] V; // random variability
}

model{
  int pos = 0; 
  vector[m] lambda; // survival sub-model Weibull distribution scale 
  vector[N] mu_y; // longitudinal sub-model subject-level mean 
  // priors
  rho ~ gamma(1, 1);
  alpha_0 ~ normal(0, 1);
  alphas ~ normal(0, 1);
  gamma_1 ~ normal(0, 1);
  gamma_2 ~ normal(0, 1);
  gamma_3 ~ normal(0, 1);
  beta_0 ~ normal(24, 1);
  betas ~ normal(0, 1);
  tau ~ normal(0, 1);
  sigma_I ~ lognormal(0, 1);
  sigma_S ~ lognormal(0, 1);
  mu_V ~ normal(0, 1);
  sigma_V ~ lognormal(0, 1);
  // likelihood 
  for (i in 1:m) {
    I[i] ~ normal(0, sigma_I);
    S[i] ~ normal(0, sigma_S);
    V[i] ~ lognormal(mu_V, sigma_V);
    // survival 
    lambda[i] = exp(alpha_0 + dot_product(alphas, X[i, ])  + gamma_1 * I[i] + gamma_2 * S[i] + gamma_3 * log(V[i]));
    T_obs[i] ~ weibull(rho, (lambda[i] ^ (- 1 / rho)));
    // longitudinal 
    for (j in 1:n[i]) {
      pos += 1;
      mu_y[pos] = beta_0 + I[i] + dot_product(betas, X[i, ]) + (tau + S[i]) * t[pos]; 
      y[pos] ~ normal(mu_y[pos], V[i]);
    }
  }
}

Summary Table

          true_vals  mean se_mean   sd  2.5%   25%   50%   75% 97.5%    n_eff Rhat
rho            2.00  1.97    0.00 0.13  1.74  1.88  1.95  2.04  2.26  4548.22 1.00
alpha_0       -1.50 -1.19    0.00 0.39 -1.95 -1.45 -1.19 -0.92 -0.44 19487.15 1.00
alphas[1]      0.50  0.86    0.00 0.09  0.68  0.79  0.85  0.91  1.05  8129.91 1.00
alphas[2]     -0.75  0.01    0.00 0.07 -0.12 -0.04  0.01  0.05  0.15 18144.00 1.00
alphas[3]      1.00 -0.03    0.00 0.08 -0.18 -0.08 -0.03  0.02  0.12 20695.98 1.00
gamma_1        0.20  0.24    0.00 0.04  0.16  0.22  0.24  0.27  0.32 25617.94 1.00
gamma_2        1.00  0.85    0.01 0.38  0.18  0.62  0.84  1.07  1.62  2916.70 1.00
gamma_3        0.50  0.30    0.00 0.25 -0.20  0.13  0.30  0.47  0.79 20114.21 1.00
beta_0        24.00 23.97    0.00 0.14 23.68 23.87 23.97 24.06 24.25  6523.86 1.00
betas[1]      -0.50  2.67    0.00 0.13  2.42  2.58  2.67  2.76  2.92  7571.03 1.00
betas[2]       2.00 -0.14    0.00 0.13 -0.40 -0.23 -0.15 -0.06  0.11  8278.45 1.00
betas[3]       1.00  0.02    0.00 0.14 -0.26 -0.07  0.02  0.12  0.30  6892.24 1.00
tau            0.00 -0.05    0.00 0.13 -0.30 -0.14 -0.06  0.03  0.21  1783.79 1.01
sigma_I        2.00  1.99    0.00 0.10  1.80  1.92  1.99  2.06  2.20 28977.09 1.00
sigma_S        0.50  0.49    0.01 0.12  0.24  0.41  0.49  0.57  0.72   307.64 1.03
mu_V           1.50  1.50    0.00 0.02  1.46  1.49  1.50  1.51  1.53 37618.18 1.00
sigma_V        0.25  0.27    0.00 0.01  0.24  0.26  0.27  0.28  0.30 27609.39 1.00

Hi! Welcome to the Stan forum!

I’m not into survival models, but I think quite a few people on this Forum ar (maybe I can tag them later).

Did you get warnings about divergent transitions? Or any other warnings?

This looks fishy. Did you try non-centered parameterization here? (The easiest is probably to write vector<multiplier=0,offset=sigma_S>[m] S in the paramteres block.)
Also, you could try with tighter priors on the sigmas (lognormal has a heavy tail).

(All of the above is assuming your model is correct, which I did not check, though.)

Cheers,
Max

It worked! Thank you very much, Max! I wasn’t aware of affine transformation before. One more question though, does the affine transformation differ from code the non-centered parameterization manually?

Edit: sorry for asking before research, I actually found this: Affine transformation in Stan

Follow-up: I tried both methods. Unfortunately, the affine transformation didn’t work as well as manually coded non-centered parameterization in terms of the number of effective samples. However, with the same seed, their estimations are very alike. Any explanation?

Affine transformation:

Manually coded NCP: