Hi all, I’m still pretty new to Stan and have been trying to code up a 3 level longitudinal model, that involves individuals measured over time who themselves are nested within studies. I want to allow multivariate normally distributed random effects to be assigned to the individuals (indrand or b2 in the model code) or to the studies (studrand or b3 in the model code).
The model below runs… but 200 iterations on my (oldish) laptop takes just over 9 hours, on a dataset of about N=30000 individuals nested in K=5 studies, with between 2 and 6 measurements made on each individual. I am trying to find ways to speed up the code below, but I’m unsure where to begin.
I’m running stan from R using rstan. I’m aware that there are package options that contain prewritten code that would fit this particular model. However, once I have this model running well, I’m intending to change various bits to make it more complicated, and so I need to be able to write this code myself rather than just use pre-written code in an available package, in order to produce code for the final model.
Any help or advice would be gratefully appreciated.
data {
//dimensions of data
int<lower=1> M; //number of observations
int<lower=1> N; //number of individuals
int<lower=1> K; //number of studies
int<lower=1> pl; //number long fixed
int<lower=1> ql; //number long indrand
int<lower=0> rl; //number long studrand
//id variables
int<lower=1> idnum[M]; //long ind id variable
int<lower=1> studnum[M]; //long stud id variable
//longitudinal outcome information
vector[M] y1; //longitudinal outcome
//longitudinal fixed information
matrix[M,pl] X_l; //design matrix long fixed
real<lower=0> beta_l_sig; //sd for fixed effects prior
real<lower=0> sigma_e_sig; //involved in prior for sigma_e - sig for longerror
real<lower=0> sigma_e_nu; //involved in prior for sigma_e - sig for longerror
//booleans
int<lower = 0, upper = 1> boo_b3_l; //boolean =1 if longstudrand included, 0 otherwise
//longitudinal indrand information
matrix[M,ql] Z2_l; //design matrix long indrand
vector[ql] b2_ones; //vector of ones for use indrand
real<lower=0> tau_Dl_sig; //involved in prior for tau_Dl - sd for indrand
real<lower=0> tau_Dl_nu; //involved in prior for tau_Dl - sd for indrand
real Omega_Dl_eta; //involved in prior for Omega_Dl - corr mat for indrand
//longitudinal studrand information
matrix[M,rl] Z3_l; //design matrix long studrand
vector[rl] b3_ones; //vector of ones for use studrand
real<lower=0> tau_Al_sig[boo_b3_l ? 1 : 0]; //involved in prior for tau_Al - sd for studrand
real<lower=0> tau_Al_nu[boo_b3_l ? 1 : 0]; //involved in prior for tau_Al - sd for studrand
real Omega_Al_eta[boo_b3_l ? 1 : 0]; //involved in prior for Omega_Al - corr mat for indrand
}
transformed data{
vector[ql] mu_b2; //mean vector for longindrand
vector[rl] mu_b3; //mean vector for longstudrand
mu_b2 = rep_vector(0,ql); //used for mean vector for long indrand
if(boo_b3_l){
mu_b3 = rep_vector(0,rl); //used for mean vector for long studrand
}
}
parameters{
vector[pl] beta_l; //fixed effects for longitudinal component
real<lower=0> sigma_e; //longitudinal error term (sd)
matrix[N,ql] b2; //longitudinal indrand effects
cholesky_factor_corr[ql] Omega_Dl; //Correlation Matrix long indrand
vector<lower=0>[ql] tau_Dl; //variance terms long indrand
matrix[K,rl] b3; //longitudinal studrand effects
cholesky_factor_corr[rl] Omega_Al; //Correlation Matrix long indrand
vector<lower=0>[rl] tau_Al; //variance terms long indrand
}
transformed parameters{
vector[M] linpred_y1; //longitudinal linear predictor
matrix[ql,ql] Dl; //covariance matrix for long indrand
matrix[rl,rl] Al; //covariance matrix for long studrand
linpred_y1 = X_l*beta_l + (Z2_l .* to_matrix(b2[idnum,]))*b2_ones;
Dl = diag_pre_multiply(tau_Dl, Omega_Dl);
if(boo_b3_l){
linpred_y1=linpred_y1 + (Z3_l .* to_matrix(b3[studnum,]))*b3_ones;
Al = diag_pre_multiply(tau_Al, Omega_Al);
}
}
model{
//Priors
tau_Dl ~ student_t(tau_Dl_nu,0,tau_Dl_sig); //prior for sig for long indrand
Omega_Dl~ lkj_corr_cholesky(Omega_Dl_eta); //prior for corr matrix for long indrand
if(boo_b3_l){
tau_Al ~ student_t(tau_Al_nu[1],0,tau_Al_sig[1]); //prior for sig for long studrand
Omega_Al~ lkj_corr_cholesky(Omega_Al_eta[1]); //prior for corr matrix for long studrand
}
beta_l ~ normal(0,beta_l_sig); //prior for fixed effects
sigma_e ~ student_t(sigma_e_nu,0,sigma_e_sig); //prior for longitudinal error term
//Likelihood
y1 ~ normal(linpred_y1,sigma_e); //update longitudinal outcome
for(i in 1:N){
b2[i] ~ multi_normal_cholesky(mu_b2, Dl); //update long indrand
}
if(boo_b3_l){
for(i in 1:K){
b3[i] ~ multi_normal_cholesky(mu_b3, Al); //update long studrand
}
}
}
To include mathematical notation in your post put LaTeX syntax between two $
symbols, e.g.,
p(\theta | y) \propto p(\theta) p(y | \theta).
Don’t forget to add relevant tags to your topic (top right of this form) for application area and/or class of models you work with.