How to Speed Up AutoRegressive P code


Any help with how to speed up this code would be very helpful. Specifically, I use for loops in some of the transformed paramaters and it would be useful to know how to vectorize these operations considering they are arrays of rowvectors.

Additionally, in the actual sampling, the only ‘large’ sampling is from the 2 vectors that are of length m ( which is ~ 1200). Is it possible for me to reparameterize here?

Any other tips would be very useful. This is a ‘baseline’ model, I plan to extend this to include other features (ie FG3M, FG2A,etc.) and it’s taking ~ 6 hours to run with iterations samples, so I imagine it would be much longer once I start including other things. If it’s any help, this samples very fast ( ie the first 120 samples take about 15 seconds) but after about 350 samples it starts to slow down dramatically.

// In this version we try to make things faster. 
data {
  int P;// each statistic will be an AR(P) so the P is the trailing lag 
  int nteams; // the number of teams 
  int nweeks; // the number of weeks (ie for full season it would be 84) 
  int tot_games;// the total number of games (ie 1230)
  int m; // nweeks*nteams/2

  int<lower=0,upper=31> home_index[nweeks*nteams/2]; // the home index of whoever is playing the i'th game 
  int<lower=0,upper =31> away_index[nweeks*nteams/2]; // the away index of whoever is playing the i'th game 
  int<lower=0> home_game_num[nweeks*nteams/2]; // index of the game number of the home team playing the ith game 
  int<lower=0> away_game_num[nweeks*nteams/2];
  vector<lower = 0>[m] home_FG3A; // the number of 3 pointers attempted by the home team
  vector<lower = 0>[m] away_FG3A; // the number of 3 pointers attempted by the away team


transformed data{
  real coeff=.2;


  vector<lower = 0,upper = 1>[P] FG3A_off_coef;  // this is an AR series with the ability of a team to get 3-point fieldgoals
  vector<lower = 0,upper = 1>[P] FG3A_def_coef; // ar series of a team that shows its ability to not give 3-point fieldgoals  

  row_vector[nteams] wn_fg3a_off; 
  row_vector[nteams] wn_fg3a_def;

  row_vector[nteams] FG3A_off_start; 
  vector<lower = 0>[nteams] sigma_FG3A;// the stdev of fg3a 
  vector[nteams] home_adv_FG3A; // the home advantage of fg3A 
  vector[nteams] deviation_FG3A_def; // the deviation of each team from team 0


transformed parameters{
  row_vector[nteams]  FG3A_off[nweeks]; // the offensive ability of a team to getzzzz FG3s
  row_vector[nteams]  FG3A_def[nweeks]; // the defensive ability of a team to get FG3s 
  FG3A_off[1] = FG3A_off_start;
  FG3A_def[1,1] = 0;
  // allow deviation so that we have a baseline defensive ability 
  for (i in 2:nteams){
    FG3A_def[1,i] = FG3A_def[1,1] + deviation_FG3A_def[i];
  for (j in 2:P){
    FG3A_off[j] = FG3A_off[j-1] + wn_fg3a_off ;
    FG3A_def[j] = FG3A_def[j-1]  + wn_fg3a_def;
  for (i in (P+1):nweeks){
      // 3 point Field Goals 
      FG3A_off[i] = wn_fg3a_off;
      FG3A_def[i] = wn_fg3a_def;
      for (k in 1:P){
        // 3 point Field Goals 
        FG3A_off[i] += FG3A_off_coef[k] *  (FG3A_off[i-k]) ;
        FG3A_def[i] += FG3A_def_coef[k] * (FG3A_def[i-k]) ;



model {
  vector[m] home_fg3a;
  vector[m] away_fg3a;
  vector[m] home_sigma_fg3a;
  vector[m] away_sigma_fg3a;

  FG3A_off_coef ~ normal(coeff,.01);
  FG3A_def_coef ~ normal(coeff,.01);
  FG3A_off_start ~ normal(30,5);
  deviation_FG3A_def ~ normal(0,1);
  home_adv_FG3A ~ normal(3,3);
  sigma_FG3A ~ normal(1,1);
  wn_fg3a_off ~ normal(0,1);
  wn_fg3a_def ~ normal(0,1);

  for (i in 1:m){
      // 2 point field goal percentage 
      home_fg3a[i] = FG3A_off[home_game_num[i],home_index[i]] - FG3A_def[away_game_num[i],away_index[i]]  + home_adv_FG3A[home_index[i]];
      away_fg3a[i] = FG3A_off[away_game_num[i],away_index[i]] - FG3A_def[home_game_num[i],home_index[i]];
      home_sigma_fg3a[i] = sigma_FG3A[home_index[i]];
      away_sigma_fg3a[i] =  sigma_FG3A[away_index[i]];
  home_FG3A ~ normal(home_fg3a,home_sigma_fg3a);
  away_FG3A ~ normal(away_fg3a,away_sigma_fg3a);


This seems to strongly indicate some pathology with the model. If you create a small simulated dataset, can you reliably recover the parameters?