Variable Selection in Parametric Survival Analysis Models

Easy speedups: 1) you tagged this as rstan, so switch to CmdStanR to use the latest Stan version; 2) add CXXFLAGS += -march=native -mtune=native to make/local (this can be done from R), which can reduce computation time by 50%; 3) use stanc_options = list("O1") when calling cmdstan_model, which even in smaller problems has reduced computation time by 25%.

Further speedup can be obtained by using OpenBLAS or MKL to use more threads without changing the code (see "Speedup by using external BLAS/LAPACK with CmdStan and CmdStanR/Py").

A further speedup that does require code changes (or using brms to build your code) is to use reduce_sum for within-chain parallelization.
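As a rough sketch of what a reduce_sum version could look like: I don't know your actual likelihood, so this assumes a right-censored Weibull model, and the names partial_loglik, t, event, X, alpha, beta0 and grainsize are all hypothetical. The priors are placeholders only.

```stan
functions {
  // Log-likelihood of observations start..end.
  // reduce_sum slices its first array argument (here the event indicators).
  real partial_loglik(array[] int event_slice, int start, int end,
                      vector t, matrix X,
                      real alpha, real beta0, vector beta) {
    int n = end - start + 1;
    // Weibull scale per observation (assumed log-linear predictor)
    vector[n] scale = exp(beta0 + X[start:end, ] * beta);
    real lp = 0;
    for (i in 1:n) {
      real ti = t[start + i - 1];
      if (event_slice[i] == 1) {
        lp += weibull_lpdf(ti | alpha, scale[i]);   // observed event
      } else {
        lp += weibull_lccdf(ti | alpha, scale[i]);  // right-censored
      }
    }
    return lp;
  }
}
data {
  int<lower=1> N;
  int<lower=1> P;
  vector<lower=0>[N] t;                    // event or censoring time
  array[N] int<lower=0, upper=1> event;    // 1 = event, 0 = censored
  matrix[N, P] X;
  int<lower=1> grainsize;                  // tuning knob; 1 lets the scheduler decide
}
parameters {
  real<lower=0> alpha;
  real beta0;
  vector[P] beta;
}
model {
  // Placeholder priors, not a recommendation
  alpha ~ lognormal(0, 1);
  beta0 ~ normal(0, 5);
  beta ~ normal(0, 1);
  // Likelihood is split into slices that are evaluated in parallel
  target += reduce_sum(partial_loglik, event, grainsize,
                       t, X, alpha, beta0, beta);
}
```

With CmdStanR this needs to be compiled with cpp_options = list(stan_threads = TRUE) and sampled with threads_per_chain set, otherwise the slices are evaluated serially.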

All of these work without GPUs.

Further speedup could be obtained by making the posterior easier to sample:

  • the non-centered parameterization below is not necessarily the best choice for big data: the likelihood is very informative, and it's possible this will create a bad funnel
  beta = z_bs .* beta_sigma;
  beta_ind[,1] = z_ibs1 .* beta_ind_sigma1;
  beta_ind[,2] = z_ibs2 .* beta_ind_sigma2;
  • why do you have the sigmas as vectors? You said P << N, so as far as I can see these could be scalars, which would make the posterior much easier to sample
  vector<lower=0>[P] beta_sigma; // Coefficient for numeric predictors
  vector<lower=0>[P_ind] beta_ind_sigma1; // Stdev for index variable =1 
  vector<lower=0>[P_ind] beta_ind_sigma2; //Stdev for index variable = 2 
  • all exponential priors are suspicious
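To make the scalar-sigma suggestion concrete, here is a minimal sketch. It assumes beta_ind is a P_ind x 2 matrix and that the z_* variables have standard normal priors, neither of which I can see in the part of the model you posted. The half-normal priors on the scales are placeholders, not a recommendation.

```stan
data {
  int<lower=1> P;
  int<lower=1> P_ind;
}
parameters {
  // One shared scale per coefficient group instead of one per coefficient
  real<lower=0> beta_sigma;
  real<lower=0> beta_ind_sigma1;
  real<lower=0> beta_ind_sigma2;
  vector[P] z_bs;
  vector[P_ind] z_ibs1;
  vector[P_ind] z_ibs2;
}
transformed parameters {
  // Scalar * vector replaces the elementwise .* of the vector version
  vector[P] beta = z_bs * beta_sigma;
  matrix[P_ind, 2] beta_ind;   // assumed shape
  beta_ind[, 1] = z_ibs1 * beta_ind_sigma1;
  beta_ind[, 2] = z_ibs2 * beta_ind_sigma2;
}
model {
  z_bs ~ std_normal();
  z_ibs1 ~ std_normal();
  z_ibs2 ~ std_normal();
  // Placeholder half-normal priors (the lower=0 bound truncates them)
  beta_sigma ~ normal(0, 1);
  beta_ind_sigma1 ~ normal(0, 1);
  beta_ind_sigma2 ~ normal(0, 1);
  // ... likelihood goes here ...
}
```

This removes P + 2 * P_ind - 3 scale parameters from the posterior. If the funnel mentioned above still shows up, the centered form (beta ~ normal(0, beta_sigma) with beta declared directly in parameters) may be worth comparing.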

If you can provide further information about the posterior (convergence diagnostics, ESSs, and some mcmc_pairs plots), that might help in giving you more specific advice.
