I have the following Stan model that I would like to apply to cases where N is of the order of 10^5, ND at most 10 and NF around 20. The following options come to my mind, in order to get more efficient w.r.t. computational resources per effective sample:
- Optimisation of this model through different datatypes. Is there any way to get rid of the last for loop, or is it actually not a problem? Any other changes to the model that could improve it?
- Using a QR decomposition for X.
- Conditioned on X, all models (the normal as well as the logistic regression models) are independent. I could equally well run ND+1 different Stan models, right? I wouldn’t loose anything? What about the run time? Here the benefit is that I can parallelise further, right?
I would appreciate your suggestions and comments. Many thanks.
data {
int<lower=0> N; // number of individuals
int<lower=0> NF; // number of covariates (e.g. age and PC's)
int<lower=0> ND; // number of diseases
vector<lower=0>[N] biomarker; // biomarker
int<lower=0, upper=1> has_disease[ND,N]; // array of vectors each indicating if disease or not
matrix[N, NF] X; // design matrix
}
/******************************************************************************/
parameters {
real<lower=0> sigma_biomarker;
real alpha_biomarker;
vector[ND] alpha_disease;
vector[NF] beta_biomarker;
matrix[ND,NF] beta_disease;
}
/******************************************************************************/
model {
sigma_biomarker ~ cauchy(0,1);
alpha_biomarker ~ normal(80,20);
beta_biomarker ~ normal(0,5);
alpha_disease ~ normal(0,5);
to_vector(beta_disease) ~ normal(0,5);
// we do not add a T[0,] here, since the prior for alpha_biomarker effectively constraints it to be positive
biomarker ~ normal(alpha_biomarker+X*beta_biomarker, sigma_biomarker);
for(i in 1:ND)
has_disease[i] ~ bernoulli_logit(alpha_disease[i] + X*(beta_disease[i])');
}
I know that the efficiency also depends on the data (that is biomarker
, has_disease
and X
), but are there any “algorithmic” tricks that would help in general?