Modeling hierarchical VAR correlated random effects

I have a VAR model defined as:
Y_{1,it} = \alpha_{1} + \beta_{1i}*Y_{1,i(t-1)} + \beta_{2i}*Y_{2,i(t-1)} +\epsilon_{1,it}
Y_{2,it} = \alpha_{2} + \beta_{3i}*Y_{1,i(t-1)} + \beta_{4i}*Y_{2,i(t-1)} +\epsilon_{2,it}.

\beta_{ji} is defined as \beta_{ji} = \beta_{j} + U_{\beta ji}, for j = 1,2,3,4 where the random effects are defined as
\epsilon_{it} \sim MVN(0, \Sigma)
U_{\beta j} \sim MVN(0, \Phi).

I will also like to point out that \epsilon_{1,it} and \epsilon_{2,it} are correlated. Similarly, all the level-2 random effects U_{\beta j} are correlated.

In order for the model to be stationary, I simulated U_{\beta j} from a multivariate normal distribution with very small variances and no covariances. Therefore, my covariance matrix \Phi is a diagonal matrix, and my correlation matrix is almost an identity matrix.

Everything seems to work but I am having trouble recovering the parameter values for the correlation matrix for the level-2 random effects. I am not sure if I should use lkj_corr_cholesky(10) as prior for the random effects correlation matrix since some of my values are way off


My stan code is

data {
  int N_Y; 
  int N_ID;
  int N_T;
  vector[2] Y[N_ID, N_T];
}

parameters {
  vector[4] UB_I[N_ID]; //random effects
  vector<lower = 0>[4] B_tau; //random effects SD
  cholesky_factor_corr[4] B_ch; //cholesky correlated random effects
  vector[4] Beta; //overall auto and cross lag regression coefficients
  vector[2] alpha; //fixed intercepts
  
  vector<lower = 0>[N_Y] Y_sd; //Level 1 residual sigma
  cholesky_factor_corr[N_Y] Y_ch; //Level 1 residual cholesky corr
}

model {
  vector[N_Y] mu_Y[N_ID,N_T]; //Expected Y
  vector[4] B_I[N_ID]; //Level 2 auto and crossed lag slopes
  
  for(j in 1: 4){
    for(i in 1:N_ID){
      B_I[i,j] = Beta[j] + UB_I[i,j];
    }
  }
  
  for (i in 1:N_ID){
    
    //time = 1
    mu_Y[i,1,1] = alpha[1];
    mu_Y[i,1,2] = alpha[2];
    
    //time > 2
    for(t in 2:N_T){
      mu_Y[i,t,1] = alpha[1] + B_I[i,1]*Y[i,t-1,1] + B_I[i,2]*Y[i,t-1,2];
      mu_Y[i,t,2] = alpha[2] + B_I[i,3]*Y[i,t-1,1] + B_I[i,4]*Y[i,t-1,2];
    }
    
    
    UB_I[i,] ~ multi_normal_cholesky(rep_vector(0,4), diag_pre_multiply(B_tau, B_ch));
    
    for (t in 1:N_T){
      Y[i,t] ~ multi_normal_cholesky(mu_Y[i,t], diag_pre_multiply(Y_sd, Y_ch));
    }
    
  }
  
  //priors
  Y_sd ~ normal(0,3);
  Y_ch ~ lkj_corr_cholesky(1);
  B_ch ~ lkj_corr_cholesky(10);
  B_tau ~ std_normal();
  Beta ~ std_normal();
}

generated quantities {
  corr_matrix[4] B_corr;
  cov_matrix[4] B_cov;
  cov_matrix[N_Y] Y_cov;
  corr_matrix[N_Y] Y_corr;
  
  B_corr = multiply_lower_tri_self_transpose(B_ch);
  B_cov = quad_form_diag(B_corr, B_tau);
  
  Y_corr = multiply_lower_tri_self_transpose(Y_ch);
  Y_cov = quad_form_diag(Y_corr, Y_sd);
}

My code for simulating the data is

N <- 200 #number of individuals
Nt <- 20 #Number of time occasions

eps <- vector("list", length = Nt) #Level-1 residual structure
Y1 <- vector("list", length = Nt)

#Residuals for Structural Model
sigma <- matrix(c(1, .2,.2,1),2)
for (j in 1:Nt) {
  eps[[j]] <- mvtnorm::rmvnorm(N, mean = c(0,0), sigma = sigma)
}

#time 1
  
Y1 <-  eps

cov_beta <- matrix(c(.01, 0, 0, 0, 
                     0, .01, 0, 0,
                     0, 0, .02, 0,
                     0, 0, 0, .05), nrow = 4)

ov_beta <- c(.4, .2, .4,.2) #overall beta_lag

#autoregression slopes
beta_lag <- mvtnorm::rmvnorm(N, mean = ov_beta, sigma = cov_beta)

  #time 2 & beyond
for (j in 2:Nt) {
  Y1[[j]][,1] <- Y1[[j-1]][,1]*beta_lag[,1] + Y1[[j-1]][,2]*beta_lag[,2] + eps[[j]][,1]
  Y1[[j]][,2] <- Y1[[j-1]][,1]*beta_lag[,3] + Y1[[j-1]][,2]*beta_lag[,4] + eps[[j]][,2]
}

Y <- array(0, dim = c(N,Nt,2))
for (j in 1:2) {
  for (t in 1:Nt) {
    for(i in 1:N) {
      Y[i,t,j] = Y1[[t]][i,j]
    }    
  }  
}

1 Like

Hey would you mind publishing some of the outputs from the model? What are the Rhats like? Does the model converge? What do the separate chains look like? What happens when you run it a few times? I.e. same problems or different ones?

Hi emiruz, sorry for the late reply.
So my stan results are:

Inference for Stan model: VAR-202005012253-1-a063aa.
3 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=1500.

             mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
Beta[1]      0.40    0.00 0.02  0.37  0.39  0.40  0.41  0.44  1110 1.00
Beta[2]      0.17    0.00 0.02  0.14  0.16  0.17  0.18  0.20  1789 1.00
Beta[3]      0.39    0.00 0.02  0.35  0.38  0.39  0.40  0.42  1404 1.00
Beta[4]      0.17    0.00 0.02  0.13  0.15  0.17  0.18  0.21  1384 1.00
B_corr[1,1]  1.00     NaN 0.00  1.00  1.00  1.00  1.00  1.00   NaN  NaN
B_corr[1,2] -0.04    0.01 0.14 -0.30 -0.14 -0.04  0.06  0.23   213 1.01
B_corr[1,3] -0.13    0.01 0.14 -0.39 -0.22 -0.13 -0.03  0.14   312 1.00
B_corr[1,4] -0.05    0.01 0.13 -0.30 -0.13 -0.05  0.03  0.20   400 1.01
B_corr[2,1] -0.04    0.01 0.14 -0.30 -0.14 -0.04  0.06  0.23   213 1.01
B_corr[2,2]  1.00     NaN 0.00  1.00  1.00  1.00  1.00  1.00   NaN  NaN
B_corr[2,3]  0.03    0.01 0.14 -0.23 -0.06  0.03  0.12  0.30   416 1.00
B_corr[2,4] -0.09    0.01 0.13 -0.34 -0.18 -0.09  0.01  0.17   375 1.00
B_corr[3,1] -0.13    0.01 0.14 -0.39 -0.22 -0.13 -0.03  0.14   312 1.00
B_corr[3,2]  0.03    0.01 0.14 -0.23 -0.06  0.03  0.12  0.30   416 1.00
B_corr[3,3]  1.00     NaN 0.00  1.00  1.00  1.00  1.00  1.00   NaN  NaN
B_corr[3,4]  0.04    0.01 0.12 -0.19 -0.04  0.04  0.12  0.28   272 1.01
B_corr[4,1] -0.05    0.01 0.13 -0.30 -0.13 -0.05  0.03  0.20   400 1.01
B_corr[4,2] -0.09    0.01 0.13 -0.34 -0.18 -0.09  0.01  0.17   375 1.00
B_corr[4,3]  0.04    0.01 0.12 -0.19 -0.04  0.04  0.12  0.28   272 1.01
B_corr[4,4]  1.00     NaN 0.00  1.00  1.00  1.00  1.00  1.00   NaN  NaN
B_cov[1,1]   0.01    0.00 0.00  0.00  0.01  0.01  0.01  0.02    43 1.02
B_cov[1,2]   0.00    0.00 0.00  0.00  0.00  0.00  0.00  0.00   140 1.02
B_cov[1,3]   0.00    0.00 0.00  0.00  0.00  0.00  0.00  0.00   381 1.00
B_cov[1,4]   0.00    0.00 0.00 -0.01  0.00  0.00  0.00  0.00   381 1.01
B_cov[2,1]   0.00    0.00 0.00  0.00  0.00  0.00  0.00  0.00   140 1.02
B_cov[2,2]   0.01    0.00 0.01  0.00  0.01  0.01  0.01  0.02    34 1.09
B_cov[2,3]   0.00    0.00 0.00  0.00  0.00  0.00  0.00  0.00   523 1.00
B_cov[2,4]   0.00    0.00 0.00 -0.01  0.00  0.00  0.00  0.00   360 1.01
B_cov[3,1]   0.00    0.00 0.00  0.00  0.00  0.00  0.00  0.00   381 1.00
B_cov[3,2]   0.00    0.00 0.00  0.00  0.00  0.00  0.00  0.00   523 1.00
B_cov[3,3]   0.02    0.00 0.01  0.01  0.01  0.02  0.02  0.03    59 1.07
B_cov[3,4]   0.00    0.00 0.00 -0.01  0.00  0.00  0.00  0.01   365 1.01
B_cov[4,1]   0.00    0.00 0.00 -0.01  0.00  0.00  0.00  0.00   381 1.01
B_cov[4,2]   0.00    0.00 0.00 -0.01  0.00  0.00  0.00  0.00   360 1.01
B_cov[4,3]   0.00    0.00 0.00 -0.01  0.00  0.00  0.00  0.01   365 1.01
B_cov[4,4]   0.04    0.00 0.01  0.03  0.03  0.04  0.05  0.06   492 1.01
Y_corr[1,1]  1.00     NaN 0.00  1.00  1.00  1.00  1.00  1.00   NaN  NaN
Y_corr[1,2]  0.22    0.00 0.02  0.19  0.21  0.22  0.23  0.25  2622 1.00
Y_corr[2,1]  0.22    0.00 0.02  0.19  0.21  0.22  0.23  0.25  2622 1.00
Y_corr[2,2]  1.00     NaN 0.00  1.00  1.00  1.00  1.00  1.00   NaN  NaN
Y_cov[1,1]   1.00    0.00 0.02  0.96  0.99  1.01  1.02  1.05   834 1.00
Y_cov[1,2]   0.22    0.00 0.02  0.19  0.21  0.22  0.23  0.25  2665 1.00
Y_cov[2,1]   0.22    0.00 0.02  0.19  0.21  0.22  0.23  0.25  2665 1.00
Y_cov[2,2]   0.99    0.00 0.02  0.94  0.97  0.99  1.00  1.04  2054 1.00

And my initial parameters for the simulation especially for the covariance and correlation matrices of the beta_lag effects are:

> cor(beta_lag)
            [,1]        [,2]        [,3]        [,4]
[1,]  1.00000000 -0.06905013 -0.10550052 -0.09441254
[2,] -0.06905013  1.00000000 -0.03730771 -0.06898384
[3,] -0.10550052 -0.03730771  1.00000000  0.14342457
[4,] -0.09441254 -0.06898384  0.14342457  1.00000000

> cov(beta_lag)
             [,1]         [,2]         [,3]         [,4]
[1,]  0.010591585 -0.000730314 -0.001462815 -0.002078992
[2,] -0.000730314  0.010561592 -0.000516556 -0.001516892
[3,] -0.001462815 -0.000516556  0.018151328  0.004134475
[4,] -0.002078992 -0.001516892  0.004134475  0.045781003

Also, my traceplot for the correlation matrix of the beta_lag is


and the traceplot for the beta_lag covariance matrix is


As you can see, the estimated parameters for the correlation matrix of the beta_lag is an issue. I’m not sure now if it’s because the values of the covariance between the lagged coefficients are too small thereby presenting an issue in the estimation process or the simulation process is flawed.

Just adding the density plots of the correlation plot of the beta_lag parameters


and covariance plot

what are the problems, exactly? I can’t see what’s wrong…
In case it helps, I wrote up a model like this in: https://www.researchgate.net/profile/Charles_Driver/publication/310747801_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling/links/5c4a0a4f458515a4c73da33a/Hierarchical-Bayesian-Continuous-Time-Dynamic-Modeling.pdf

I was thinking the correlation estimated values will match closely to the original values in all cases. I guess I’m being too hard on my model :).
Thanks, very much for sharing your article. It will definitely help me with my model.