Speed up sampling of Weibull Survival Model

I have a Weibull model for left-truncated and right-censored data with time-varying predictors. The code below takes about 2.5-3 hours to run on 15k records, and I need to run it on about 300k records. Can anyone spot any obvious improvements that would make this faster? I added some comments and simplified where I could; I hope it helps.

I’m also wondering whether having three different parameter vectors (all parameters distributed normal(0, 1)) is less efficient than having one big vector and just accessing the correct indices where needed. Or is there perhaps a way to avoid the for-loop where I use the index variables and vectorize appropriately? I’m not sure where I can win in terms of efficiency, so any advice will be appreciated.

data {
  int<lower=0> N; //Number of observations
  int<lower=0> P; // Number of numeric predictors
  int<lower=0> P_ind; // Number of factor variables (binary)
  vector[N] d; // Survival event variable 1:Death, 0:Censored
  vector<lower=0>[N] stop; // age at stop time
  vector<lower=0>[N] start; // age at start time
  int<lower=1, upper=2> X_ind[N, P_ind]; // Index matrix for categorical variables; values are either 1 or 2
  matrix[N, P] X;  //Design matrix for numeric variables
}

parameters {
  real<lower=0> alpha; //Shape parameter of Weibull
  vector<lower=0>[P] beta_sigma; // Coefficient for numeric predictors
  vector<lower=0>[P_ind] beta_ind_sigma1; // Stdev for index variable =1 
  vector<lower=0>[P_ind] beta_ind_sigma2; //Stdev for index variable = 2 
  vector[P] z_bs; // Normal(0, 1) value for non-centered parameterisation for numeric variables.
  vector[P_ind] z_ibs1; // Normal(0, 1) value for non-centered parameterisation for index variables.
  vector[P_ind] z_ibs2;
}
transformed parameters {
  vector[P] beta; 
  matrix[P_ind, 2] beta_ind; //Two columns representing parameters for index variables
  beta = z_bs .* beta_sigma;
  beta_ind[,1] = z_ibs1 .* beta_ind_sigma1;
  beta_ind[,2] = z_ibs2 .* beta_ind_sigma2;
}

model {
  vector[N] sigma;
  vector[N] sigma_ind_linear;
  vector[P_ind] tempsum;

// Priors
  beta_sigma ~ exponential(3);
  beta_ind_sigma1 ~ exponential(3);
  beta_ind_sigma2 ~ exponential(3);
  z_bs ~ normal(0, 1);
  z_ibs1 ~ normal(0, 1);
  z_ibs2 ~ normal(0, 1);
  alpha ~ gamma(2, 2);

// Model
// Adding up index variables contribution to target
  for(i in 1:N){
    for(j in 1:P_ind){
        tempsum[j] =  beta_ind[j, X_ind[i, j]];
    }
    sigma_ind_linear[i] = sum(tempsum);
  }
// Linear model
  sigma = exp((X*beta) + sigma_ind_linear)+0.001; // adding delta for numerical stability.
  
  for(i in 1:N){
    if(d[i]==0){
      target += weibull_lccdf(stop[i]|alpha, sigma[i]) - weibull_lccdf(start[i]| alpha, sigma[i]);
    }
    else {
      target += weibull_lpdf(stop[i]|alpha, sigma[i]) - weibull_lccdf(start[i]| alpha, sigma[i]);
    }
  }
}

Although some of the questions at the top remain open, a preliminary answer in a related thread can be seen at Variable Selection in Parametric Survival Analysis Models.


There are a couple of things you can do to speed this up. First, this loop can be simplified to avoid the temporary vector and all the extra assignments,

for (i in 1:N){
  for (j in 1:P_ind){
    tempsum[j] =  beta_ind[j, X_ind[i, j]];
  }
  sigma_ind_linear[i] = sum(tempsum);
}

can be just

for (n in 1:N) {
  sigma_ind_linear[n] = 0;
  for (j in 1:P_ind) sigma_ind_linear[n] += beta_ind[j, X_ind[n, j]];
}
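
If you want to drop the loop over N altogether, one option is to exploit the fact that the indices only take the values 1 or 2. This is a sketch rather than tested code; X_ind2 is a helper name I'm introducing, and beta_ind is kept as your original matrix here. Precompute a 0/1 matrix once in transformed data,

transformed data {
  // 0/1 recoding of the index matrix: 1 where the level is 2, 0 where it is 1
  matrix[N, P_ind] X_ind2;
  for (n in 1:N)
    for (j in 1:P_ind)
      X_ind2[n, j] = X_ind[n, j] - 1;
}

and then the whole contribution reduces to matrix algebra in the model block, since beta_ind[j, X_ind[n, j]] = beta_ind[j, 1] + (X_ind[n, j] - 1) * (beta_ind[j, 2] - beta_ind[j, 1]):

sigma_ind_linear = sum(beta_ind[, 1]) + X_ind2 * (beta_ind[, 2] - beta_ind[, 1]);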

Your transformed parameters can be tightened to this, while restructuring beta_ind into an array of two vectors (one per level).

vector[P] beta = z_bs .* beta_sigma;
array[2] vector[P_ind] beta_ind = { z_ibs1 .* beta_ind_sigma1, z_ibs2 .* beta_ind_sigma2 };

This’ll let you write the lookup in the loop above as beta_ind[X_ind[n, j], j], indexing the level first.

I would suggest removing the + 0.001 part of this. If the rest is underflowing to 0, you want to fix this with a prior or reparameterization, not by adding a small number.

sigma = exp((X*beta) + sigma_ind_linear)+0.001; // adding delta for numerical stability.

The model can be sped up considerably with vectorization. The conditional on d[i] == 0 is a killer and should be resolved once in transformed data. That way, you create vectors of the stop and start times for the censored and uncensored cases (and pick out the matching entries of sigma in the model block), and the likelihood vectorizes as follows, keeping your left-truncation terms.

target += weibull_lccdf(censored_stop | alpha, censored_sigma)
          - weibull_lccdf(censored_start | alpha, censored_sigma);
target += weibull_lupdf(obs_stop | alpha, obs_sigma)
          - weibull_lccdf(obs_start | alpha, obs_sigma);

I also changed the basic Weibull to lupdf from lpdf, which drops the unnecessary normalizing terms.

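A sketch of that split (the helper names are mine, and it assumes a recent Stan version that allows declarations after statements in a block):

transformed data {
  int N_obs = 0;
  for (n in 1:N)
    N_obs += (d[n] == 1);
  int N_cens = N - N_obs;
  array[N_obs] int obs_idx;
  array[N_cens] int cens_idx;
  {
    int o = 1;
    int c = 1;
    for (n in 1:N) {
      if (d[n] == 1) { obs_idx[o] = n; o += 1; }
      else { cens_idx[c] = n; c += 1; }
    }
  }
  // Pre-sliced data used by the vectorized likelihood above
  vector[N_obs] obs_stop = stop[obs_idx];
  vector[N_obs] obs_start = start[obs_idx];
  vector[N_cens] censored_stop = stop[cens_idx];
  vector[N_cens] censored_start = start[cens_idx];
}

In the model block, after computing sigma:

vector[N_obs] obs_sigma = sigma[obs_idx];
vector[N_cens] censored_sigma = sigma[cens_idx];
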
I’d also check that the priors are consistent with the actual data; if they mismatch, sampling can slow down considerably.
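
One quick way to do that is a prior predictive check: simulate survival times implied by the priors alone and see whether they cover the range of your observed stop times. This is just a sketch of a standalone simulation (run it with the fixed_param algorithm; the _sim names are placeholders, and the normal(0, 1) draw is a rough stand-in for the linear predictor):

generated quantities {
  real alpha_sim = gamma_rng(2, 2);
  real log_sigma_sim = normal_rng(0, 1);
  real t_sim = weibull_rng(alpha_sim, exp(log_sigma_sim));
}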


Wow, thank you so much Bob, it always amazes me how you guys take some of your time to help others, and the speed with which you make sense of another person’s code is crazy. Since making this post I have of course continued to optimise, and I followed the recommendations in the link posted above.

However, those optimisations were mainly focused on the compiler side (changing flags, switching to CmdStan, etc.). The one suggestion I had implemented in the meantime was to split the observed and censored data, which made a massive difference (in line with your last suggestion). Also, @avehtari pointed out that for such a large sample size the centered parameterisation might be faster, which also turned out to be true, so I removed the transformed parameters block.
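
For anyone reading later, the centered version of the numeric-coefficient part looks roughly like this (just the pieces that change; the index-variable coefficients follow the same pattern):

parameters {
  vector<lower=0>[P] beta_sigma;
  vector[P] beta; // sampled directly; no z_bs needed
}
model {
  beta_sigma ~ exponential(3);
  beta ~ normal(0, beta_sigma); // centered parameterisation
}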

That said, I’m keen to implement your other suggestions; they make so much sense.

Just for clarification, you’re suggesting I declare beta_ind as an array of two vectors of size P_ind, rather than the matrix I had?

Also, are you saying that having an array of vectors is more efficient than the matrix I declared? I eventually switched over to an array in my code, but it wasn’t obvious to me how to vectorise that.

Yes, I removed this as soon as I split the observed and censored data; it definitely made a difference.

I didn’t know this was available; I was using Stan version 2.21.0, but I see it’s available from 2.25.0. I’ll make a few updates on my side.
