Modeling Multivariate Normal Variance-Covariance

I’m wondering if there’s a better way to specify the following model. I’d like to model the variance-covariance matrices based on the model matrix. Right now, the only way I’m able to get this to work is to sort the matrix so it’s ordered by factor combinations and use if statements to determine which matrix to effect.

I’m sure there must be a better way to do this. Any suggested improvements or resources to look at would be appreciated.

Here is some R code to create an example.

library(rstan)

# Create design, remove and order factor level combinations
x1 <- letters[1:3]
x2 <- letters[24:26]

df <- replicate(20*11, expand.grid(x1, x2), FALSE)
df <- do.call(rbind, df)
df <- df[order(df$Var1, df$Var2), ]
remove <- with(df, Var1 == x1[1] & Var2 == x2[1] | Var1 == x1[3] & Var2 == x2[3] )
df <- df[!remove,]

# Initialize responses and extract model matrix
df$x <- rnorm(nrow(df))
df$y <- rnorm(nrow(df))
mm <- model.matrix(lm(y ~ Var1 + Var2, data = df))

# Define coefficients for "true" standard deviations
x_betas <- c(12, 4, 7, -3, -1)
y_betas <- c(6, 8, 13, -3, -1)

# Calculate "true" standard deviations
sx_true <- as.vector(mm %*% x_betas)
sy_true <- as.vector(mm %*% y_betas)

# Simulate data
combins <- unique(df[, c("Var1", "Var2")])
for(i in 1:nrow(combins)){
  rows <- which(df$Var1 == combins$Var1[i] & df$Var2 == combins$Var2[i])
  df$x[rows] <- rnorm(length(rows), 0, unique(sx_true[rows]))
  df$y[rows] <- rnorm(length(rows), 0, unique(sy_true[rows]))
}

# Data to be passed to Stan
dataList <- list(
  N = nrow(df),                              # Number of observation
#  K = ncol(mm),                              # Number of predictors el, ch (could add interaction)
  J = 2,                                     # Number of responses (x and y)
#  x = mm,                                    # model matrix (extracted from lm earlier)
  y = as.matrix(df[,c("x", "y")])            # matrix of responses
)

The following is my Stan model for the variance-covariance matrices of the x and y responses based on the different factor level combinations.

# Write Stan model
stringModel <- "
data {
  int<lower=0> N;           // number of observations
  int<lower=1> J;           // number of responses
  vector[J] y[N];           // an N x J matrix of responses
}
parameters {
  // 7 different covariance matrices to correspond with 7 different factor level combinations
  cov_matrix[J] sigma1;   // covariance matrix for J responses
  cov_matrix[J] sigma2;   // covariance matrix for J responses
  cov_matrix[J] sigma3;   // covariance matrix for J responses
  cov_matrix[J] sigma4;   // covariance matrix for J responses
  cov_matrix[J] sigma5;   // covariance matrix for J responses
  cov_matrix[J] sigma6;   // covariance matrix for J responses
  cov_matrix[J] sigma7;   // covariance matrix for J responses
}
model {
  for (i in 1:N){
    if (i < 221)
      y[i] ~ multi_normal([ 0, 0 ]', sigma1);
    else if (i < 441)
      y[i] ~ multi_normal([ 0, 0 ]', sigma2);
    else if (i < 661)
      y[i] ~ multi_normal([ 0, 0 ]', sigma3);
    else if (i < 881)
      y[i] ~ multi_normal([ 0, 0 ]', sigma4);
    else if (i < 1101)
      y[i] ~ multi_normal([ 0, 0 ]', sigma5);
    else if (i < 1321)
      y[i] ~ multi_normal([ 0, 0 ]', sigma6);
    else
      y[i] ~ multi_normal([ 0, 0 ]', sigma7);
  }
}"
1 Like

Hi,
I am not sure I undersatnd your model completely, and thus I am unsure whether there is a better way to build the mathematical model. But if I treat the Stan model you shared as a given, I think the solution would be to create an array of covariance matrices:

cov_matrix[J] sigma[K];

then you could pass the group as data and have:

for (i in 1:N){
      y[i] ~ multi_normal([ 0, 0 ]', sigma[group[i]]);
}

Note however, that your approach is equivalent to just fitting a separate model for each category.

Additionally 1.15 Multivariate outcomes | Stan User’s Guide is probably relevant for a faster parametrization of multivariate normal.

Best of luck with your model!

1 Like

Thank you. That was definitely helpful for rewriting this in a simpler format and making this a better parameterization. My model code is now:

data {
  int<lower=0> N;         // number of observations
  int<lower=1> J;         // number of responses
  vector[J] y[N];         // an array of J=2 response vectors each of length N
  int<lower=1> K;         // number of groups
  int group[N];           // a vector of length N indicating group membership
}
parameters {
  cholesky_factor_corr[J] L_Omega[K];
  vector<lower=0>[J] L_sigma[K];
}
model {
  matrix[J, J] L_Sigma[K];
  
  for(i in 1:K){
    L_Omega[i] ~ lkj_corr_cholesky(1.0);
    L_sigma[i] ~ cauchy(0, 2.5);
    L_Sigma[i] = diag_pre_multiply(L_sigma[i], L_Omega[i]);
  }
  
  for(k in 1:N){
    y[k] ~ multi_normal_cholesky( [0,0]' , L_Sigma[group[k]] );
  }
}
generated quantities {
  corr_matrix[J] Omega[K];  //matrix[J,J] Omega[K];
  cov_matrix[J] Sigma[K];   //matrix[J,J] Sigma[K];

  for(i in 1:K){
    Omega[i] = multiply_lower_tri_self_transpose(L_Omega[i]);
    Sigma[i] = quad_form_diag(Omega[i], L_sigma[i]);
  }
}

You noted that this approach is

equivalent to just fitting a separate model for each category.

In what way could I change this so that’s not the case? What I really care about is how the variance of the two response variables change as a function of the factors, but this is the only way I was able to think of how to set this up. And I didn’t really want to break this into a separate model for each response as there’s potential that they are correlated.

1 Like

My statement above might have been a bit confusing - the model takes into account correlations between responses (the J elements of each y[n, ]). However, it is equivalent to fitting a completely separate multivariate normal to the subset of y that correspond to each of the K groups. I.e. you assume that the correlation structure within each group is independent of the other groups. If that’s desirable/OK than no problem, just wanted to highlight this. (also it is probably quite hard to introduce a sensible structure here)…

Does that make sense?

1 Like

Yes, that all makes sense. Thanks for the help. It certainly helped with some of the syntax and moreover helped me realize that I should work through some of the StanCon materials on YouTube and GitHub.