Un-vectorizing a multinormal model results in non-convergence

jerome · June 9, 2022, 11:28am

I am experimenting with fitting data to a multi-normal model. I used Jake Ying’s example code as a starting point.

Here’s the R code:

N <- 100
mu <- c(1, 2, -5) # means
Q <- matrix(c(1, 0.7, 0.2, # correlation matrix (Q)
             0.7, 1, -0.5,
             0.2, -0.5, 1),
            nrow = 3, byrow = TRUE)
sigmas <- c(0.5, 1.2, 2.3) # sd1=0.5, sd2=1.2, sd3=2.3
Sigma <- diag(sigmas) %*% Q %*% diag(sigmas) # VCV matrix
set.seed(123)
samples <- MASS::mvrnorm(N, mu = mu, Sigma = Sigma) # data simulation

multinorm_dat <- list(N = N,
                      K = nrow(Q),
                      y = samples)

fit <- stan(file = "multinorm.stan", data = multinorm_dat)

and here’s multinorm.stan:

 data {
  int<lower=0> N; // number of observations
  int<lower=1> K; // dimension of observations
  vector[K] y[N]; // observations: a list of N vectors (each has K elements)
}

parameters {
  vector[K] mu;
  cholesky_factor_corr[K] Lcorr; // cholesky factor (L_u matrix for R)
  vector<lower=0>[K] sigma;
}

transformed parameters {
  corr_matrix[K] Q; // correlation matrix
  cov_matrix[K] Sigma; // VCV matrix
  Q = multiply_lower_tri_self_transpose(Lcorr); // R = Lcorr * Lcorr'
  Sigma = quad_form_diag(Q, sigma); // quad_form_diag: diag_matrix(sig) * R * diag_matrix(sig)
}

model {
  sigma ~ cauchy(0, 5); // prior for sigma
  Lcorr ~ lkj_corr_cholesky(2.0); // prior for cholesky factor of a correlation matrix
  y ~ multi_normal(mu, Sigma);
}

When I run the model above on the test data, I get quick convergence and healthy chains. However if I change the model block slightly as below, and iterate over each y vector, instead of using the fully vectorized form over the array of y vectors, then I get no convergence at all, and n_eff values of 2, instead of 2,000+.

model {
  sigma ~ cauchy(0, 5); // prior for sigma
  Lcorr ~ lkj_corr_cholesky(2.0); // prior for cholesky factor of a correlation matrix
  for (i in 1:N) {
    y[N] ~ multi_normal(mu, Sigma);
  }
}

I realize that the first, fully vectorized form of the model is preferable. However I would like to understand why looping, besides being slower, doesn’t converge.

simonbrauer · June 9, 2022, 1:15pm

Looks like a typo. It should be

for (i in 1:N) {
    y[i] ~ multi_normal(mu, Sigma);
  }

Topic		Replies	Views
Multivariate normal with highly correlated data Modeling fitting-issues , performance	16	1791	December 4, 2018
Inference for multinormal on cholesky-transformed data fails to recover sensible estimates Modeling fitting-issues	3	52	January 17, 2025
Multivariate normal data covariance estimation Modeling performance , covariance , multivariate-normal	3	1449	May 10, 2020
Possibly improving the performance of Mulivariantnormal Distribution Modeling	2	474	February 28, 2020
Coping with varying curvature Modeling fitting-issues	9	750	August 24, 2019

Un-vectorizing a multinormal model results in non-convergence

Related topics