Implementing non-centered parameterization for multivariate autoregressive model

I am modeling three variables, x1, x2, and x3. Each has observations across both countries and time; they are TSCS or panel data in other words, but with ragged panels since the time-series differ in length for each country. I would like to model various relationships among these variables, both contemporaneous and lagged, as in a structural equation or multivariate / vector autoregression setup.

A twist is that all three are unobserved. They are estimated in my model using measurement models and some observed data. Since the measurement part of the model is fine, I have removed most of these features from the stan code below to keep things as simple as possible.

I would like to model the variance of each of the structural equations using non-centered parameters. Yet I can’t figure out how to do this for the last variable, x3. I believe that non-centered parameterizations require one to move the model equation from the model block to the transformed parameters block, using something like x_err = x_ncp * sigma_x, with the non-centered parameters and the variance term then being sampled parameters.

But I can’t include the model for x3 along with the models for x1 and x2 in the transformed parameters block because the x3 model includes a hierarchical component, featuring country averages of all covariates (this is a Mundlak device to obtain within-country estimates of these covariates). These country averages have to be calculated in the loop in the transformed block since x1-x3 are all being estimated in that loop and do not exist as observed data.

This has forced me to place either the x3 model (as below) or the alpha model in the model block, using a centered parameterization of the respective variance term (sigma_x3 or sigma_alpha). This does not seem to converge.

Is there perhaps an indexing trick which would allow me to include all models in the transformed parameters loop, or a way of implementing a non-centered parameterization of a variance term in the model block (or perhaps a better way of modeling this kind of multivariate autoregressive model with latent variables)?

  int<lower=1> J;  // number of countries
  int<lower=1> N;  // number of estimates to be modeled
  int<lower=1,upper=J> jjn[N];  // country j for estimate n
  int<lower=1> est_pos[J];  // indicator showing start point on estimate vector for each cntry
  int<lower=1> est_pos_err[J];  // indicator showing start point on error matrix for each cntry
  int<lower=1> len_ts[J];  // indicator showing length of each cntry time series
  ...				      
  }

parameters{
  matrix[N-J,2] Theta_ncp;  // non-centered parameters for x1 and x2 error terms
  vector[N] x3;  // latent x3 estimates
  vector[J] alpha_ncp;  // non-centered parameters for varying country intercepts, x3 model
  real<lower=0> sigma_x3;  // x3 model error SD
  real<lower=0> sigma_alpha;  // varying country intercepts error SD, x3 model
  corr_matrix[2] Omega_theta;  // correlation matrix x1 & x2 estimates
  vector<lower=0>[2] tau_theta;  // cor -> cov conversion, x1 & x2 estimates
  row_vector[J] x1_init;  // initial latent x1 for year 0
  row_vector[J] x2_init;  // initial latent x2 for year 0
  vector<lower=0,upper=1>[3] zeta_sum;  // structural lag coefs sum lag
  vector<lower=-1,upper=1>[3] zeta2;  // structural 2nd lag coefs
  vector[6] Beta;  // covariate coefficients
  vector[3] mu;  // structural intercepts
  vector[4] Kappa;  // country-level covariate coefficients
  ...
}
  
transformed parameters{
  vector[N] x1;  // latent x1 estimates	
  vector[N] x2;  // latent x2 estimates	
  vector[N] x1_err;  // x1 model residuals	
  vector[N] x2_err;  // x2 model residuals	
  matrix[N-J,2] Theta_err;  // x1 & x2 errors
  matrix[2,2] Sigma_theta;  // variance-covariance matrix, x1 & x2 estimates
  vector[3] zeta1;  // structural models 1st lag coef. 
  vector<upper=1>[3] zeta_diff;  // structural models diff lag 
  vector[J] alpha;  // x3 model varying country intercepts
  vector[J] x1_j;  // country mean estimate x1, lagged once
  vector[J] x2_j;  // country mean estimate x2, lagged once
  vector[J] x3_l1_j;  // country mean estimate x3, lagged once
  vector[J] x3_l2_j;  // country mean estimate x3, lagged twice
  ...
  zeta1 = zeta_sum - zeta2;
  zeta_diff = zeta2 - zeta1;  
  Sigma_theta = quad_form_diag(Omega_theta, tau_theta);  
  Theta_err = Theta_ncp * Sigma_theta; // x1 and x2 error models with non-centered pars
  // structural models of x1 and x2
  for (j in 1:J) {       
    // x1 & x2 models for year 0  
    x1[est_pos[j]] = x1_init[j];   
  	x2[est_pos[j]] = x2_init[j];
	// x1 & x2 models for year 1 
    x1_err[(est_pos[j]+1)] = Theta_err[(est_pos_err[j]), 1]; 
	x2_err[(est_pos[j]+1)] = Theta_err[(est_pos_err[j]), 2]; 
	x2[(est_pos[j]+1)] = mu[2] 
	 + zeta_sum[2] * x2[est_pos[j]]
	 + Beta[3] * (x3[(est_pos[j]+1)] - x3[est_pos[j]]) 
	 + Beta[4] * x3[est_pos[j]] 
	 + x2_err[(est_pos[j]+1)];
    x1[(est_pos[j]+1)] = mu[1] 
	 + zeta_sum[1] * x1[est_pos[j]]
	 + Beta[1] * (x3[(est_pos[j]+1)] - x3[(est_pos[j])]) 
	 + Beta[2] * x3[est_pos[j]] 
	 + x1_err[(est_pos[j]+1)]; 
    // x1 & x2 models for years 2+ 
    for (i in 2:(len_theta_ts[j]-1)) {
      x1_err[(est_pos[j]+i)] = Theta_err[(est_pos_err[j]+i-1), 1]; 
	  x2_err[(est_pos[j]+i)] = Theta_err[(est_pos_err[j]+i-1), 2]; 
	  x2[(est_pos[j]+i)] = mu[2] 
	    + zeta1[2] * x2[(est_pos[j]+i-1)]
        + zeta2[2] * x2[(est_pos[j]+i-2)]	  
	    + Beta[3] * (x3[(est_pos[j]+i)] - x3[(est_pos[j]+i-1)]) 
		+ Beta[4] * x3[(est_pos[j]+i-1)] 
	    + x2_err[(est_pos[j]+i)];
	  x1[(est_pos[j]+i)] = mu[1] 
	    + zeta1[1] * x1[(est_pos[j]+i-1)] 
	    + zeta2[1] * x1[(est_pos[j]+i-2)] 
	    + Beta[1] * (x3[(est_pos[j]+i)] - x3[(est_pos[j]+i-1)]) 
	    + Beta[2] * x3[(est_pos[j]+i-1)]  
	    + x1_err[(est_pos[j]+i)];
	  }
	x1_j[j] = mean(segment(x1, est_pos[j]+1, len_theta_ts[j]-2));
	x2_j[j] = mean(segment(x2, est_pos[j]+1, len_theta_ts[j]-2));  
	x3_l1_j[j] = mean(segment(x3, est_pos[j]+1, len_theta_ts[j]-2));  
	x3_l2_j[j] = mean(segment(x3, est_pos[j], len_theta_ts[j]-2));  
    }
  // varying country intercepts model (for x3 model)
  alpha = Kappa[1] * x3_l1_j 
    + Kappa[2] * x3_l2_j 
    + Kappa[3] * x1_j 
    + Kappa[4] * x2_j 
	+ alpha_ncp * sigma_alpha;
 ...	
}

model{
  sigma_x3 ~ normal(0, 1); 
  sigma_alpha ~ normal(0, 1); 
  tau_theta ~ normal(0, 0.5);
  Omega_theta ~ lkj_corr(2);
  alpha_ncp ~ normal(0, 1); 
  to_vector(Theta_ncp) ~ normal(0, 1); 
  mu ~ normal(0, 1);
  zeta_sum ~ beta(3, 1);
  zeta2 ~ uniform(-1, 1);
  Beta ~ normal(0, 1); 
  Kappa ~ normal(0, 1);
  x1_init ~ normal(0, 1);
  x2_init ~ normal(0, 1);
  // structural model for x3
  for (j in 1:J) {   
    x3[est_pos[j]] ~ normal(0, 1);  // x3 model for year 0
    x3[(est_pos[j]+1)] ~ normal(mu[3]  // x3 model for year 1
	 + zeta_sum[3] * x3[est_pos[j]],
     sigma_x3);
    for (i in 2:(len_theta_ts[j]-1)) {  // x3 model for years 2+
      x3[(est_pos[j]+i)] ~ normal(mu[3] 
	   + zeta1[3] * x3[(est_pos[j]+i-1)] 
	   + zeta2[3] * x3[(est_pos[j]+i-2)] 
	   + Beta[5] * x1[(est_pos[j]+i-1)]
	   + Beta[6] * x2[(est_pos[j]+i-1)]
	   + alpha[jjn[(est_pos[j]+i)]],
       sigma_x3);	   
    }
  }
  ...
}```

Implementing a non-centered parameterization doesn’t need to involve using the TP block. The only difference between the TP & model blocks when it comes to declaring/computing variables is that variables in the TP block:

  1. Will be saved to file in the output (model-block-declared variables will not be saved)
  2. Are permitted to have bounds in their declaration, in which case the sampler will throw an error when their computation falls outside the declared bounds. This may not halt the sampler, but will do so if too many samples lead to a TP falling outside its bounds.

Thanks for the reply @mike-lawrence. I hadn’t considered implementing non-centered parameterizations in the model block. However, I’m not sure how that would work.

To take a simple example – the reparameterization of neal’s funnel used in the handbook – how would one move the non-centered reparameterization of y, for example, to the model block?

parameters {
  real y_raw;
  vector[9] x_raw;
}
transformed parameters {
  real y;
  vector[9] x;

  y = 3.0 * y_raw;
  x = exp(y/2) * x_raw;
}
model {
  y_raw ~ std_normal(); // implies y ~ normal(0, 3)
  x_raw ~ std_normal(); // implies x ~ normal(0, exp(y/2))
}

As simple as cutting the content of TP and pasting into the model block :)

parameters {
  real y_raw;
  vector[9] x_raw;
}
model {
  real y;
  vector[9] x;

  y = 3.0 * y_raw;
  x = exp(y/2) * x_raw;
  y_raw ~ std_normal(); // implies y ~ normal(0, 3)
  x_raw ~ std_normal(); // implies x ~ normal(0, exp(y/2))
}

Though note that there’s also syntax for the declaration that does everything for you:

parameters {
  real<multiplier=3> y;
  vector[9]<multiplier=exp(y/2)> x;
}
model {
  y ~ normal(0,3) ;
  x ~ normal(0, exp(y/2)) ;
}
2 Likes

Interesting, thank you @mike-lawrence.

It seems that you have to move both the y and x statements to the model block don’t you? The following wouldn’t work because y in the model block is not available in the TP block. So it’s either both in the TP block or both in the model block (where neither are saved in the output).

parameters {
  real y_raw;
  vector[9] x_raw;
}
transformed parameters {
  vector[9] x;
  x = exp(y/2) * x_raw;
}
model {
  real y;
  y = 3.0 * y_raw;
  y_raw ~ std_normal(); // implies y ~ normal(0, 3)
  x_raw ~ std_normal(); // implies x ~ normal(0, exp(y/2))
}

Correct. Though if for some reason you don’t want both in the TP block, you can always redo the computation for one or both in the GQ block to save one or both.