Optimize Stan code - hierarchical model

Hi,

Is there any better way to parametrize this hierarchical model?

Thank you,

data {
 int<lower=0> N;  //number of observations
 real<lower=0> CPS[N];  //Cost per service (response variable) 
 int<lower=0> Njkm;  //total number of state-service-market (168 total, 28 do not have observed CPS)
 int<lower=0> Nkm;  //number of market services (8)
 //int<lower=1> jkm_identifier[N]; //state-service-market identifier level 1
 //int<lower=1> km_identifier[N]; //service-market identifier level 1
 //int<lower=0> gender_identifier[N]; //population gender effect
 matrix[N,Njkm] design_matrix_1;
 matrix[N,Nkm] design_matrix_2;
 matrix[Njkm,Nkm] design_matrix_3;
 matrix[Njkm,24] design_matrix_4;
}
transformed data {
  }
parameters {
// real beta_0;
 vector[Njkm] mu_jkm;
 vector[Nkm] gamma_km;
 vector[24] beta_km;
 vector<lower=0>[Nkm] sigma_cps_km;
 vector<lower=0>[Nkm] sigma_mu_km;
 //real theta_g; //Population gender effect
}
transformed parameters {
     vector[Njkm] gamma_km_index = design_matrix_3 * gamma_km + design_matrix_4 * beta_km ; 
     vector<lower=0>[Njkm] sigma_mu_index = design_matrix_3 * sigma_mu_km;
     
     vector[N] mu_index =design_matrix_1 * mu_jkm;
     vector<lower=0>[N] sigma_cps_index = design_matrix_2 * sigma_cps_km;

    for (n in 1:N){
      mu_index[n]=exp(mu_index[n]);
    }
}
model { 
//Priors
//target += normal_lpdf(beta_0jkm | beta_km, sigma_u0jkm);

//likelihood
//for (i in 1:N){
//CPS[i] ~ normal (mu[i], sigma_cps[i]);
target += normal_lpdf(CPS |mu_index, sigma_cps_index);
target += normal_lpdf(mu_jkm |gamma_km_index, sigma_mu_index);
}
generated quantities{
   real yrep[N];
   for (i in 1:N)
   yrep[i]= normal_rng(mu_index[i], sigma_cps_index[i]);
}

So in this thread @avehtari gives OP the following advise:

if you’re looking to reparameterise because the model runs slow or doesn’t converge then @avehtari advise RE priors might apply here too since none of the parameters seem to have priors in your model unless I’ve misunderstood something.

2 Likes

Yes, thank you.
Another point, is this matrix notation the most efficient way to write the problem?

Just IMHO, and assuming your objective is performance, I would tend to think about it in terms of (1) Am I performing any operations that I don’t need to? (2) Have I vectorised as much as possible?

According to this, the exp function is vectorised. so you can write :

mu_index=exp(mu_index);

instead of:

for (n in 1:N){
      mu_index[n]=exp(mu_index[n]);
    }

Similarly, this says that normal_rng is vectorised so you can write this:

yrep=normal_rng(mu_index, sigma_cps_index);

instead of this:

 for (i in 1:N)
   yrep[i]= normal_rng(mu_index[i], sigma_cps_index[i]);

Except for that, I don’t see anything else personally, but others might.

I can’t vectorize yrep:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable definition base type mismatch, variable declared as base type vector variable definition has base type real[ ] error in ‘model49c4729690b_stan_CPS_11’ at line 50, column 54

48: }
49: generated quantities{
50: vector[N] yrep = normal_rng(mu_index, sigma_cps_index);
                                                         ^
51: }

Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model ‘stan_CPS_11’ due to the above error.

Just make it an array instead ?

1 Like