Performance issues regarding hierarchical spatial quantile regression

Hi, first-time user here. I’m having performance issues with a hierarchical spatial quantile model application. In short, we have a two-level school-region hierarchical approach. At the region level, we use quantile regression, bringing information from the school level using a spatial autoregression.

I have already done simulations with 500 schools inside 90 regions, the model is correctly implemented. However, the real data have 2.500 schools and the estimated time for 1000 transitions with 10 leapfrogs would take 40.000-60.000 seconds. For comparison, the same model with 500 schools would take something about 500-2.000 seconds.

functions {
  real g(real gamma) {
    real gg; 
    gg = 2 * normal_cdf(-fabs(gamma),0,1) * exp(gamma^2 / 2); 
    return gg;

data {
  int<lower=1> n; //total of counties
  int<lower=1> m; //total of schools
  int<lower=1> k; //total of counties covariates
  int<lower=1> r; //total of school covariates
  real L; // gamma lower bound
  real U; // gamma upper bound
  real<lower=0, upper=1> t0; //quantile of interest
  vector[m] ys; // school response vector
  matrix[m,n] H; // point matrix
  matrix[m,r] G; // school design matrix
  matrix[n,n] W; // proximity matrix
  matrix[n,k] X; // county design matrix

transformed data {
  matrix[n,n] I = diag_matrix(rep_vector(1, n)); // generic identity matrix

parameters {
  vector[k] beta; // school regressors
  vector[r] theta; // county regressors
  real<lower=-1,upper=1> phi; // spatial parameter
  real<lower=L,upper=U> gamma; // shape parameter
  real<lower=0> sigma_c; // county variability
  real<lower=0> sigma_s; // school variability
  vector<lower=0>[m] v; // latent variable
  vector<lower=0>[m] z; // latent variable

transformed parameters {
 real<lower=0,upper=1> t; // quantile
 real a; 
 real<lower=0> b;
 real c;

 t =  (gamma<0) + (t0-(gamma<0))/g(gamma);
 a = (1 - 2 * t) / (t * (1 - t));
 b = 2 / (t * (1 - t));
 c = pow((gamma>0)-t,-1);

model {
 matrix[m,n] D = H*inverse(I - phi * W);  //spatial multiplier 
 vector[m]   Mu = D*X*beta + G*theta + a*v + sigma_s*c*fabs(gamma)*z;   //multivariate normal likelihood vector of means 
 matrix[m,m] Sigma = sigma_c*D*D' + b*sigma_s*diag_matrix(v); //multivariate normal likelihood covariance matrix 
 z ~ normal(0,1);  // latent variable prior
 v ~ exponential(pow(sigma_s, -1)); //latent variable prior
 ys ~ multi_normal(Mu, Sigma); //likelihood
 beta ~ normal(0,100);  //beta prior
 theta ~ normal(0,100);  //theta prior
 phi ~ uniform(-1,1);  //phi prior
 sigma_s ~ cauchy(0,100);  //school sigma prior
 sigma_c ~ cauchy(0,100);   //county sigma prior
 gamma ~ student_t(1,0,1);  // shape prior

To me, it seems that the extensive usage of matrices is somewhat expensive. However, that’s how the model is constructed, using a multivariate normal distribution as likelihood function. Another situation is that the parameters sigma_s and gamma are heavily correlated, according to the literature.

Is there some good practice to speed up things that I’m missing here?

I will really appreciate any kind of help!