# Implementing non-centered parameterization for multivariate autoregressive model

I am modeling three variables, x1, x2, and x3. Each has observations across both countries and time; they are TSCS or panel data in other words, but with ragged panels since the time-series differ in length for each country. I would like to model various relationships among these variables, both contemporaneous and lagged, as in a structural equation or multivariate / vector autoregression setup.

A twist is that all three are unobserved. They are estimated in my model using measurement models and some observed data. Since the measurement part of the model is fine, I have removed most of these features from the stan code below to keep things as simple as possible.

I would like to model the variance of each of the structural equations using non-centered parameters. Yet I canâ€™t figure out how to do this for the last variable, x3. I believe that non-centered parameterizations require one to move the model equation from the model block to the transformed parameters block, using something like x_err = x_ncp * sigma_x, with the non-centered parameters and the variance term then being sampled parameters.

But I canâ€™t include the model for x3 along with the models for x1 and x2 in the transformed parameters block because the x3 model includes a hierarchical component, featuring country averages of all covariates (this is a Mundlak device to obtain within-country estimates of these covariates). These country averages have to be calculated in the loop in the transformed block since x1-x3 are all being estimated in that loop and do not exist as observed data.

This has forced me to place either the x3 model (as below) or the alpha model in the model block, using a centered parameterization of the respective variance term (sigma_x3 or sigma_alpha). This does not seem to converge.

Is there perhaps an indexing trick which would allow me to include all models in the transformed parameters loop, or a way of implementing a non-centered parameterization of a variance term in the model block (or perhaps a better way of modeling this kind of multivariate autoregressive model with latent variables)?

``````  int<lower=1> J;  // number of countries
int<lower=1> N;  // number of estimates to be modeled
int<lower=1,upper=J> jjn[N];  // country j for estimate n
int<lower=1> est_pos[J];  // indicator showing start point on estimate vector for each cntry
int<lower=1> est_pos_err[J];  // indicator showing start point on error matrix for each cntry
int<lower=1> len_ts[J];  // indicator showing length of each cntry time series
...
}

parameters{
matrix[N-J,2] Theta_ncp;  // non-centered parameters for x1 and x2 error terms
vector[N] x3;  // latent x3 estimates
vector[J] alpha_ncp;  // non-centered parameters for varying country intercepts, x3 model
real<lower=0> sigma_x3;  // x3 model error SD
real<lower=0> sigma_alpha;  // varying country intercepts error SD, x3 model
corr_matrix[2] Omega_theta;  // correlation matrix x1 & x2 estimates
vector<lower=0>[2] tau_theta;  // cor -> cov conversion, x1 & x2 estimates
row_vector[J] x1_init;  // initial latent x1 for year 0
row_vector[J] x2_init;  // initial latent x2 for year 0
vector<lower=0,upper=1>[3] zeta_sum;  // structural lag coefs sum lag
vector<lower=-1,upper=1>[3] zeta2;  // structural 2nd lag coefs
vector[6] Beta;  // covariate coefficients
vector[3] mu;  // structural intercepts
vector[4] Kappa;  // country-level covariate coefficients
...
}

transformed parameters{
vector[N] x1;  // latent x1 estimates
vector[N] x2;  // latent x2 estimates
vector[N] x1_err;  // x1 model residuals
vector[N] x2_err;  // x2 model residuals
matrix[N-J,2] Theta_err;  // x1 & x2 errors
matrix[2,2] Sigma_theta;  // variance-covariance matrix, x1 & x2 estimates
vector[3] zeta1;  // structural models 1st lag coef.
vector<upper=1>[3] zeta_diff;  // structural models diff lag
vector[J] alpha;  // x3 model varying country intercepts
vector[J] x1_j;  // country mean estimate x1, lagged once
vector[J] x2_j;  // country mean estimate x2, lagged once
vector[J] x3_l1_j;  // country mean estimate x3, lagged once
vector[J] x3_l2_j;  // country mean estimate x3, lagged twice
...
zeta1 = zeta_sum - zeta2;
zeta_diff = zeta2 - zeta1;
Theta_err = Theta_ncp * Sigma_theta; // x1 and x2 error models with non-centered pars
// structural models of x1 and x2
for (j in 1:J) {
// x1 & x2 models for year 0
x1[est_pos[j]] = x1_init[j];
x2[est_pos[j]] = x2_init[j];
// x1 & x2 models for year 1
x1_err[(est_pos[j]+1)] = Theta_err[(est_pos_err[j]), 1];
x2_err[(est_pos[j]+1)] = Theta_err[(est_pos_err[j]), 2];
x2[(est_pos[j]+1)] = mu[2]
+ zeta_sum[2] * x2[est_pos[j]]
+ Beta[3] * (x3[(est_pos[j]+1)] - x3[est_pos[j]])
+ Beta[4] * x3[est_pos[j]]
+ x2_err[(est_pos[j]+1)];
x1[(est_pos[j]+1)] = mu[1]
+ zeta_sum[1] * x1[est_pos[j]]
+ Beta[1] * (x3[(est_pos[j]+1)] - x3[(est_pos[j])])
+ Beta[2] * x3[est_pos[j]]
+ x1_err[(est_pos[j]+1)];
// x1 & x2 models for years 2+
for (i in 2:(len_theta_ts[j]-1)) {
x1_err[(est_pos[j]+i)] = Theta_err[(est_pos_err[j]+i-1), 1];
x2_err[(est_pos[j]+i)] = Theta_err[(est_pos_err[j]+i-1), 2];
x2[(est_pos[j]+i)] = mu[2]
+ zeta1[2] * x2[(est_pos[j]+i-1)]
+ zeta2[2] * x2[(est_pos[j]+i-2)]
+ Beta[3] * (x3[(est_pos[j]+i)] - x3[(est_pos[j]+i-1)])
+ Beta[4] * x3[(est_pos[j]+i-1)]
+ x2_err[(est_pos[j]+i)];
x1[(est_pos[j]+i)] = mu[1]
+ zeta1[1] * x1[(est_pos[j]+i-1)]
+ zeta2[1] * x1[(est_pos[j]+i-2)]
+ Beta[1] * (x3[(est_pos[j]+i)] - x3[(est_pos[j]+i-1)])
+ Beta[2] * x3[(est_pos[j]+i-1)]
+ x1_err[(est_pos[j]+i)];
}
x1_j[j] = mean(segment(x1, est_pos[j]+1, len_theta_ts[j]-2));
x2_j[j] = mean(segment(x2, est_pos[j]+1, len_theta_ts[j]-2));
x3_l1_j[j] = mean(segment(x3, est_pos[j]+1, len_theta_ts[j]-2));
x3_l2_j[j] = mean(segment(x3, est_pos[j], len_theta_ts[j]-2));
}
// varying country intercepts model (for x3 model)
alpha = Kappa[1] * x3_l1_j
+ Kappa[2] * x3_l2_j
+ Kappa[3] * x1_j
+ Kappa[4] * x2_j
+ alpha_ncp * sigma_alpha;
...
}

model{
sigma_x3 ~ normal(0, 1);
sigma_alpha ~ normal(0, 1);
tau_theta ~ normal(0, 0.5);
Omega_theta ~ lkj_corr(2);
alpha_ncp ~ normal(0, 1);
to_vector(Theta_ncp) ~ normal(0, 1);
mu ~ normal(0, 1);
zeta_sum ~ beta(3, 1);
zeta2 ~ uniform(-1, 1);
Beta ~ normal(0, 1);
Kappa ~ normal(0, 1);
x1_init ~ normal(0, 1);
x2_init ~ normal(0, 1);
// structural model for x3
for (j in 1:J) {
x3[est_pos[j]] ~ normal(0, 1);  // x3 model for year 0
x3[(est_pos[j]+1)] ~ normal(mu[3]  // x3 model for year 1
+ zeta_sum[3] * x3[est_pos[j]],
sigma_x3);
for (i in 2:(len_theta_ts[j]-1)) {  // x3 model for years 2+
x3[(est_pos[j]+i)] ~ normal(mu[3]
+ zeta1[3] * x3[(est_pos[j]+i-1)]
+ zeta2[3] * x3[(est_pos[j]+i-2)]
+ Beta[5] * x1[(est_pos[j]+i-1)]
+ Beta[6] * x2[(est_pos[j]+i-1)]
+ alpha[jjn[(est_pos[j]+i)]],
sigma_x3);
}
}
...
}`````````

Implementing a non-centered parameterization doesnâ€™t need to involve using the TP block. The only difference between the TP & model blocks when it comes to declaring/computing variables is that variables in the TP block:

1. Will be saved to file in the output (model-block-declared variables will not be saved)
2. Are permitted to have bounds in their declaration, in which case the sampler will throw an error when their computation falls outside the declared bounds. This may not halt the sampler, but will do so if too many samples lead to a TP falling outside its bounds.

Thanks for the reply @mike-lawrence. I hadnâ€™t considered implementing non-centered parameterizations in the model block. However, Iâ€™m not sure how that would work.

To take a simple example â€“ the reparameterization of nealâ€™s funnel used in the handbook â€“ how would one move the non-centered reparameterization of y, for example, to the model block?

``````parameters {
real y_raw;
vector[9] x_raw;
}
transformed parameters {
real y;
vector[9] x;

y = 3.0 * y_raw;
x = exp(y/2) * x_raw;
}
model {
y_raw ~ std_normal(); // implies y ~ normal(0, 3)
x_raw ~ std_normal(); // implies x ~ normal(0, exp(y/2))
}
``````

As simple as cutting the content of TP and pasting into the model block :)

``````parameters {
real y_raw;
vector[9] x_raw;
}
model {
real y;
vector[9] x;

y = 3.0 * y_raw;
x = exp(y/2) * x_raw;
y_raw ~ std_normal(); // implies y ~ normal(0, 3)
x_raw ~ std_normal(); // implies x ~ normal(0, exp(y/2))
}
``````

Though note that thereâ€™s also syntax for the declaration that does everything for you:

``````parameters {
real<multiplier=3> y;
vector[9]<multiplier=exp(y/2)> x;
}
model {
y ~ normal(0,3) ;
x ~ normal(0, exp(y/2)) ;
}
``````
2 Likes

Interesting, thank you @mike-lawrence.

It seems that you have to move both the y and x statements to the model block donâ€™t you? The following wouldnâ€™t work because y in the model block is not available in the TP block. So itâ€™s either both in the TP block or both in the model block (where neither are saved in the output).

``````parameters {
real y_raw;
vector[9] x_raw;
}
transformed parameters {
vector[9] x;
x = exp(y/2) * x_raw;
}
model {
real y;
y = 3.0 * y_raw;
y_raw ~ std_normal(); // implies y ~ normal(0, 3)
x_raw ~ std_normal(); // implies x ~ normal(0, exp(y/2))
}
``````

Correct. Though if for some reason you donâ€™t want both in the TP block, you can always redo the computation for one or both in the GQ block to save one or both.