Reduce_sum with hierarchical vector auto-regression

I’m trying to use reduce_sum with a multilevel VAR, and I’m having trouble figuring out how to wrangle my data structure to work with reduce_sum (or whether it’s even possible).

My units are regions, nested within countries. The model partially pools at the country level (country-level random intercepts and slopes). The data are structured as an 11x5x120 array (time periods x variables x regions), where each 11x5 matrix is one region's time series.
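In other words (writing the likelihood in Stan-style notation, with Y, alpha, beta, Omega, and tau as declared in the code further down), for region r in country c = country[r] and each time point tt > 1:

to_vector(Y[tt, :, r]) ~ multi_normal(alpha[c] + beta[c] * to_vector(Y[tt-1, :, r]), quad_form_diag(Omega[c], tau[c]));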

I’m running into trouble with how to define the slice, given that it needs to slice an array and each region’s time series must be kept intact (for the lags to make sense). (This is my first time using reduce_sum, so I may also be misunderstanding how it works.)

This is the code I have so far:

functions{
  real partial_sum(real[] slice_y,
                   int start, int end,
                   matrix beta,
                   matrix alpha,
                   matrix Omega,
                   vector tau,
                   int t,
                   int country,
                   int R,
                   int K
                   ) {
    for(r in 1:R){
     matrix[t, K] y_temp = to_matrix(slice_y[:, :, r]);
    for(tt in 1:t){
      vector[K] lagvars; //lagged variables
     //linear predictor
      if(tt > 1){
      	lagvars = to_vector(y_temp[tt-1,:]);
      	vector[K] mu; //linear predictor
        mu = alpha[country[r]] + beta[country[r]]*lagvars;
        return multi_normal_lpdf(to_vector(y_temp[tt,:]) | mu, quad_form_diag(Omega[country[r]], tau[country[r]]));
      }
    }
    }
  }

}

data{
  
  int N; //# observations
  int K; //# dimensions of Y
  int C; //# of countries
  int R; //# of regions
  int t; //# of time periods in the panel
  
  int<lower = 1, upper=C> country[R]; //country id for each region
  int<lower = 1, upper=t> time[N]; //time period id for each obs
  int<lower = 1, upper=R> region[N]; //region id for each obs

  real Y[t, K, R]; //the outcome array - each variable's time series stacked by region
  int grainsize; //grainsize for reduce_sum
  
}

parameters{
  
  
  //individual level
  vector<lower = 0>[K] tau[C]; //scale for residuals
  matrix[K, K] z_beta[C]; //untransformed betas 
  vector[K] z_alpha[C]; //untransformed intercepts
  
  //hierarchical parameters
  corr_matrix[K] Omega[C]; //country level correlation matrix
  vector[K] tau_loc; //mean for variance scaling factor
  vector<lower=0>[K] tau_scale; //scale for tau
  matrix[K, K] bhat_location; //mean for prior on beta
  matrix<lower = 0>[K, K] bhat_scale; //scale for prior on beta
  vector[K] ahat_location; //means for prior on intercepts
  vector<lower = 0>[K] ahat_scale; //variance for intercept prior
  
  
}

transformed parameters{
  
  matrix[K, K] beta[C]; //VAR(1) coefficients, country specific
  vector[K] alpha[C]; //country specific intercepts

  
  for(c in 1:C){
    //recentering random effects
    alpha[c] = ahat_location + ahat_scale .*z_alpha[c];
    beta[c] = bhat_location + bhat_scale .* z_beta[c];
  }
 
}

model{
  
  //hyperpriors
  tau_loc ~ cauchy(0,1);
  tau_scale ~ cauchy(0,1);
  ahat_location ~ normal(0,1);
  ahat_scale ~ cauchy(0, 1); 
  to_vector(bhat_location) ~ normal(0, 0.5);
  to_vector(bhat_scale) ~ cauchy(0, 0.5);

  
  //hierarchical priors
  for(c in 1:C){
    //non-centered parameterization to avoid convergence issues
    z_alpha[c] ~ normal(0, 1);
    to_vector(z_beta[c]) ~ normal(0, 1);
    tau[c] ~ normal(tau_loc, tau_scale);
    Omega[c] ~ lkj_corr(5);
  }
  
  
  //reduce sum likelihood
  target += reduce_sum(partial_sum, Y, grainsize, beta, alpha,
                        Omega, tau, t, country, R, K);
  
}

Where I run into trouble is the matrix[t, K] y_temp = to_matrix(slice_y[:, :, r]) line. In the original model, this was designed to subset a particular region out of the array. I believe this is where the start and end integers need to go, but I’m not sure how to incorporate that while still respecting the multi-level nature of the data.
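(For reference, the original non-reduce_sum version computed this likelihood with a loop over regions in the model block, roughly like the following - variable names here are just for illustration:)

for(r in 1:R){
  matrix[t, K] y_r = to_matrix(Y[:, :, r]); //pull region r's series out of the array
  for(tt in 2:t){
    vector[K] lagvars = to_vector(y_r[tt-1, :]);
    vector[K] mu = alpha[country[r]] + beta[country[r]] * lagvars;
    to_vector(y_r[tt, :]) ~ multi_normal(mu, quad_form_diag(Omega[country[r]], tau[country[r]]));
  }
}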

I’ve looked at this post, but the nature and structure of their model are different enough from mine (and the code is complex enough) that I haven’t been able to find a clear way to adapt it to my model.

How can I get this model running with reduce_sum, if possible?

Thanks in advance for any help!


Can’t you slice by region?

Okay, I think maybe I’m misunderstanding how the slicing works. So would something like this enable me to slice on region? (Some variables would need to be dropped in the new setup, but this is the general idea.)

functions{
  real partial_sum(int[] r_slice,
                   int start, int end,
                   matrix Y,
                   matrix beta,
                   matrix alpha,
                   matrix Omega,
                   vector tau,
                   int t,
                   int country,
                   int R,
                   int K
                   ) {
    for(tt in 1:t){
        matrix[t, K] y_temp = to_matrix(Y[r_slice]);
        vector[K] lagvars; //lagged variables
        real mu[K]; //linear predictor
        if(tt > 1){
      	  lagvars = to_vector(y_temp[tt-1,:]);
          mu[t] = alpha[country[t]] + beta[country[t]]*lagvars;
          return multi_normal_lpdf(y[t]|mu,quad_form_diag(Omega[country[t]], tau[country[t]]))
      }
    }
  }

}

Slicing works in whatever way you define it.

This does not look like it will work

matrix[t, K] y_temp = to_matrix(Y[r_slice]);

This will select from Y as many rows as the slice has, but "t" is some fixed integer, so how is that supposed to fit together?
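For example (a standalone illustration, not code from your model - Ymat, r_slice, and y_sub are made-up names), multi-indexing a matrix with an integer array just picks out those rows:

matrix[10, 5] Ymat; //imagine some 10 x 5 matrix
int r_slice[3] = {4, 7, 9}; //a slice containing three region ids
matrix[3, 5] y_sub = Ymat[r_slice]; //three rows of Ymat, one per index - not a t x K block for one region

So the number of rows you get depends on the length of the slice, not on t.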

Can you write your existing model as a for loop over regions?

Every region has t observations on K variables, so if the slice is an entire region, that matrix should always be t x K. As I was understanding it, slicing by the region indicator would produce each region as a slice?

I originally did loop over regions in the model block, but that code was extremely slow. The problem is that each region is an independent time series, so both the region and the sequence of the data matter - i.e. a slice has to maintain an intact time series for each region. Maybe reduce_sum just isn’t ideal for this data setup?

No, if you can write the loop over the regions and each region is independent of the other regions, then that sounds like a good slicing unit to me.

You should probably start by writing a function which calculates the per-region log-lik. Then you write the model as a loop which just calls this function for each region. Once that works, you can start to use reduce_sum, where you just loop over region subsets (by using an index of 1:region which you slice over).
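For your setup that could look roughly like this - just an untested sketch, with the function name, signature, and the data layout from your first post as my assumptions:

functions{
  //log-lik of one region's t x K series, given its country's parameters
  real region_lpdf(matrix y_r, vector alpha_c, matrix beta_c,
                   matrix Omega_c, vector tau_c, int t, int K) {
    real lp = 0;
    matrix[K, K] Sigma = quad_form_diag(Omega_c, tau_c);
    for(tt in 2:t)
      lp += multi_normal_lpdf(to_vector(y_r[tt, :]) |
                              alpha_c + beta_c * to_vector(y_r[tt-1, :]), Sigma);
    return lp;
  }
}

and then in the model block the same model as before, written as one call per region:

for(r in 1:R)
  target += region_lpdf(to_matrix(Y[:, :, r]) | alpha[country[r]], beta[country[r]],
                        Omega[country[r]], tau[country[r]], t, K);

Once that reproduces the lp__ of your current model, the reduce_sum step is mostly mechanical.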

Makes sense?


Yep - thanks for the help!

Sorry, clarification question. When you say slice over an index of 1:region, do you mean that the slice argument to partial_sum() should be the region index, or that I should feed partial_sum already-subset data? I think I’m still struggling to understand how I control the slicing.

Right now, my partial sum function looks like this:

functions{
    real partial_sum(matrix y_slice, int start, int end, int K, vector r_time, matrix beta_r, vector alpha_r,
    matrix Omega_r, vector tau_r, int t) {
   
  matrix[K, t] mu; //linear predictor
  real lp = 0; //accumulated log-likelihood
  
    
  for(tt in 1:t){
    vector[K] lagvars;
    if(tt > 1){
      lagvars = to_vector(y_slice[tt-1,:]);
      mu[:,tt] = alpha_r + beta_r*lagvars;
    }
  }
  for(tt in 2:t){
    lp += multi_normal_lpdf(to_vector(y_slice[tt]) | mu[:,tt], quad_form_diag(Omega_r, tau_r));
  }
  return lp;
  
}
                 
}

And the relevant part of the model block looks like this, where I subset the data for each region and feed it to the function:

 for(r in 1:R){
   //define variables to give to function
   int r_country;
   matrix[t, K] y_func;
   r_country = country[r];
   y_func = to_matrix(Y[:,:,r]);
   
   target += reduce_sum(partial_sum, y_func, grainsize, //y for specific region
   K, //dimensions of output
   to_vector(time[:,r]), //time index for region
   to_matrix(beta[r_country]), // relevant VAR coefficients
   to_vector(alpha[r_country]), //relevant VAR intercepts
   to_matrix(Omega[:][r_country]), //relevant country-level correlation matrix 
   tau[:][r_country], //relevant country-level scale 
   t); //total number of time periods
   
  }

Should the slicing variable instead be the index that defines the loop (r, which runs from 1 to R = 120)?

(I’m guessing I’m on the wrong track, since the existing code throws an ill-typed arguments error: "Instead supplied arguments of incompatible type: (matrix, int, int, int, vector, matrix, vector, matrix, vector, int) => real, matrix, int, int, vector, matrix, vector, matrix, vector, int".)

Thanks for the help so far! I feel like this is one of those things where I’m missing something very obvious…

The first argument of the partial_sum function must be an array of whatever is being sliced.

To make progress in this case I suggest you do the following:

  1. Do NOT use reduce_sum to start!
  2. Define two functions. The first computes the log-lik of a single region - call it region_lpdf, for example.
  3. The second is a “partial_sum” function to which you pass the entire data and all parameters; it does the slicing of the large data set and passes each region in turn into region_lpdf - basically this is the body of the for loop which you show (see the sketch below).
  4. Write a for loop in the model block where you loop over 1:R and call the partial_sum function as partial_sum(r, r, …);
  5. Once that works, call it in the model not with a loop over single regions but in larger chunks, like

target += partial_sum(1, R/2, …);
target += partial_sum(R/2+1, R, …);

and only then plug in reduce_sum to drive it all.
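Building on the region_lpdf sketch above, a rough (untested) sketch of the partial_sum and the staged calls in the model block - region_idx and grainsize are illustrative names you would add in transformed data and data:

functions{
  //region_lpdf from the earlier sketch goes in this same block

  //sums region_lpdf over an arbitrary sub-slice of region indices
  real partial_sum(int[] region_slice, int start, int end,
                   real[,,] Y, int[] country,
                   vector[] alpha, matrix[] beta,
                   matrix[] Omega, vector[] tau,
                   int t, int K) {
    real lp = 0;
    for(i in 1:size(region_slice)){
      int r = region_slice[i];
      int c = country[r];
      lp += region_lpdf(to_matrix(Y[:, :, r]) | alpha[c], beta[c], Omega[c], tau[c], t, K);
    }
    return lp;
  }
}

transformed data{
  int region_idx[R]; //the 1:R index array that gets sliced
  for(r in 1:R) region_idx[r] = r;
}

model{
  //...priors as in your model block...

  //step 4: still an explicit loop, one region per call - same lp__ as before
  //for(r in 1:R)
  //  target += partial_sum({r}, r, r, Y, country, alpha, beta, Omega, tau, t, K);

  //step 5: two big chunks by hand - still the same lp__
  //target += partial_sum(region_idx[1:(R/2)], 1, R/2, Y, country, alpha, beta, Omega, tau, t, K)
  //        + partial_sum(region_idx[(R/2+1):R], R/2+1, R, Y, country, alpha, beta, Omega, tau, t, K);

  //final step: let reduce_sum choose and parallelize the splits
  target += reduce_sum(partial_sum, region_idx, grainsize,
                       Y, country, alpha, beta, Omega, tau, t, K);
}

Note that start and end are not needed inside partial_sum here, because country and the parameters are looked up directly through the region ids carried in region_slice.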

Please go in small steps. Only move on once you have a firm and correctly working model. Always check that the model is still the same model (by having some data and a parameter draw for which you compute “lp__”, which you can do easily in rstan).

Ah okay, I think I understand now. To confirm: if I define partial_sum to split the data by region, reduce_sum will likewise do so? I had confused myself into thinking that reduce_sum added additional splits on top of partial_sum. Is it correct (in a simplistic sense) to say that reduce_sum just parallelizes the splits created by partial_sum?

reduce_sum will split the data according to what makes sense given the problem size and the resources. The partial_sum function must be able to compute the likelihood contribution from an arbitrarily large sub-slice. With multiple cores available, the independent partial sums over the different slices are calculated in parallel, of course.


Great, thank you - it finally makes sense to me! I’ll have to rethink the data structure a bit to make that work, but very doable. Thanks so much for the help!