Data Structure for 2-level multilevel VAR

stnorton · June 29, 2020, 7:54pm

Hi all,

I’m attempting to fit a VAR based on code found here, but adding another layer. Essentially, I have individual time series for regions which are nested within countries, and would like to pool on countries. I’m running into trouble setting up the data structure for this. The model code is:

data{
  
  int N; //# observations
  int K; //# dimensions of Y
  int C; //# of countries
  int R; //# of regions
  int T; //# of time periods in the panel
  
  int<lower = 1, upper=C> country[R]; //country id for each region
  int<lower = 1, upper=T> time[N]; //time period id for each obs
  int<lower = 1, upper=R> region[N]; //region id for each obs

  matrix[T,K] Y[R]; //the outcome array - each variable's time series stacked by region
  
}

parameters{
  
  
  //individual level
  vector<lower = 0>[K] tau[C]; //scale for residuals
  matrix[K, K] z_beta[C]; //untransformed betas 
  vector[K] z_alpha[C]; //untransformed intercepts
  
  //hierarchical parameters
  corr_matrix[K] Omega[C]; //country level correlation matrix
  vector[K] tau_loc; //mean for variance scaling factor
  vector<lower=0>[K] tau_scale; //scale for tau
  matrix[K, K] bhat_location; //mean for prior on beta
  matrix<lower = 0>[K, K] bhat_scale; //scale for prior on beta
  vector[K] ahat_location; //means for prior on intercepts
  vector<lower = 0>[K] ahat_scale; //variance for intercept prior
  
  
}

transformed parameters{
  
  matrix[K, K] beta[C]; //VAR(1) coefficients, country specific
  vector[K] alpha[C]; //country specific intercepts

  
  for(c in 1:C){
    //recentering random effects
    alpha[c] = ahat_location + ahat_scale .*z_alpha[c];
    beta[c] = bhat_location + bhat_scale*z_beta[c];
  }
 
}

model{
  
  //hyperpriors
  tau_loc ~ cauchy(0,1);
  tau_scale ~ cauchy(0,1);
  ahat_location ~ normal(0,1);
  ahat_scale ~ cauchy(0, 1); 
  to_vector(bhat_location) ~ normal(0, 0.5);
  to_vector(bhat_scale) ~ cauchy(0, 0.5);

  
  //hierarchical priors
  for(c in 1:C){
    //non-centered paramaterization to avoid convergence issues
    z_alpha[c] ~ normal(0, 1);
    to_vector(z_beta[c]) ~ normal(0, 1);
    tau[c] ~ normal(tau_loc, tau_scale);
    Omega[c] ~ lkj_corr(5);
  }
  
  
  //likelihood
  for(r in 1:R){
    matrix[T, K] y_temp;
    y_temp = Y[1:T, 1:K, r];
    for(t in 1:T){
      if(t > 1){
        y_temp[t] ~ multi_normal(alpha[country[r]] + beta[country[r]]*Y[t-1]',
        quad_form_diag(Omega[country[r]], tau[country[r]]));
      }
    }
  }
  
}

The problem I’m running into is with the sampling statement. The goal here is to split out the time series for each region, and sample use the coefficient for the country for each region. In my head, the easiest way to do this is to create a local temporary matrix that contains only the Y for each region (y_temp) above. However, I can’t figure out how to subset an array in that way - the subset that I attempt creates a vector, which can’t be assigned to a matrix.

This leads to two questions:

How can I properly subset the array in order to run my sampling loop?
Is there a better way to set-up the data structure (particularly with respect to model speed)?

Max_Mantei · July 6, 2020, 11:10pm

Hey! Sorry for the delay. Questions like this are always a bit complicated, because it’s kind of heard to keep track of indexing, especially if the model is so complex.

I could be completely wrong, but since you declared your outcome like this

… I think Y[r] (without more indexes) will return the rth T by K matrix in the array. So this

should be matrix[T, K] y_temp = Y[r].

Again, I could be completely wrong on this one. :/

You can also put print() statements in your Stan program to see which shape your data, variables, parameters etc. have (best with one chain and a handful of iterations, otherwise you’ll be buried in numbers). If all of this doesn’t work, try to make it work for a simpler model and see if you can get the shape correctly.

Some more resources (you might already know) which could be useful:

Hope this helps.

Cheers,
Max

stnorton · July 7, 2020, 4:46pm

Thanks for slogging through the model, that did it! On to making the model sample correctly and maybe eventually, in a reasonable amount of time.

Topic		Replies	Views
Reduce_sum with hierarchical vector auto-regression Modeling	11	1882	July 14, 2020
Speeding up hierarchical VAR Modeling performance	9	1113	May 18, 2021
Improving speed in hierarchical VAR modelling Modeling	0	368	August 22, 2019
Multilevel hierarchical linear regression model with nested predictors Modeling	1	1554	January 9, 2018
How to incorporate differences over countries in multilevel model for panel data Modeling	9	970	January 4, 2019

Data Structure for 2-level multilevel VAR

Related topics