I am experimenting with fitting data to a multi-normal model. I used Jake Ying’s example code as a starting point.
Here’s the R code:
N <- 100
mu <- c(1, 2, -5) # means
Q <- matrix(c(1, 0.7, 0.2, # correlation matrix (Q)
0.7, 1, -0.5,
0.2, -0.5, 1),
nrow = 3, byrow = TRUE)
sigmas <- c(0.5, 1.2, 2.3) # sd1=0.5, sd2=1.2, sd3=2.3
Sigma <- diag(sigmas) %*% Q %*% diag(sigmas) # VCV matrix
set.seed(123)
samples <- MASS::mvrnorm(N, mu = mu, Sigma = Sigma) # data simulation
multinorm_dat <- list(N = N,
K = nrow(Q),
y = samples)
fit <- stan(file = "multinorm.stan", data = multinorm_dat)
and here’s multinorm.stan
:
data {
int<lower=0> N; // number of observations
int<lower=1> K; // dimension of observations
vector[K] y[N]; // observations: a list of N vectors (each has K elements)
}
parameters {
vector[K] mu;
cholesky_factor_corr[K] Lcorr; // cholesky factor (L_u matrix for R)
vector<lower=0>[K] sigma;
}
transformed parameters {
corr_matrix[K] Q; // correlation matrix
cov_matrix[K] Sigma; // VCV matrix
Q = multiply_lower_tri_self_transpose(Lcorr); // R = Lcorr * Lcorr'
Sigma = quad_form_diag(Q, sigma); // quad_form_diag: diag_matrix(sig) * R * diag_matrix(sig)
}
model {
sigma ~ cauchy(0, 5); // prior for sigma
Lcorr ~ lkj_corr_cholesky(2.0); // prior for cholesky factor of a correlation matrix
y ~ multi_normal(mu, Sigma);
}
When I run the model above on the test data, I get quick convergence and healthy chains. However if I change the model block slightly as below, and iterate over each y
vector, instead of using the fully vectorized form over the array of y
vectors, then I get no convergence at all, and n_eff
values of 2, instead of 2,000+.
model {
sigma ~ cauchy(0, 5); // prior for sigma
Lcorr ~ lkj_corr_cholesky(2.0); // prior for cholesky factor of a correlation matrix
for (i in 1:N) {
y[N] ~ multi_normal(mu, Sigma);
}
}
I realize that the first, fully vectorized form of the model is preferable. However I would like to understand why looping, besides being slower, doesn’t converge.