A simple multivariate normal distribution failed. Anything wrong?

ecosang · March 28, 2021, 6:05am

Hello,
I am testing my toy code for a multivariate normal distribution.
After making a simple toy data, I did inference.
Simply
x\sim \text{MVN}(\mu,\Sigma)
\mu \sim N(3,5)

Also, I put some general priors over \Sigma.
I think this is a very simple problem, but the MCMC fails.
I am wondering if I did any mistake or it is nature of multivariate normal distribution.
I put error messages and seesion info below. But, it looks like all the 4 chains diverge neither they are similar.

Thanks

library(rstan)
library(mvtnorm)

set.seed(1234)
cov=matrix(c(1,0.5,0.3),nrow=3,ncol=1)%*%matrix(c(1,0.5,0.3),nrow=1,ncol=3)
# create data from 
dat=mvtnorm::rmvnorm(n=100,mean=c(0,6,7),sigma=cov)


stan_code='data {
  int<lower=1> N; // number of observations
  int<lower=1> J; // dimension of observations
  vector[J] x[N]; // observations
}
parameters {
  corr_matrix[J] Omega; 
  vector<lower=0>[J] sigma; 
  vector[J] mu; // a vector of Zeros (fixed means of observations)
}
transformed parameters {
  cov_matrix[J] Sigma; 
  Sigma = quad_form_diag(Omega, sigma); 
}
model {
  mu~normal(3,5);
  sigma ~ lognormal(0, 3); // prior on the standard deviations
  Omega ~ lkj_corr(2); // LKJ prior on the correlation matrix 
  x ~ multi_normal(mu,Sigma); // sampling distribution of the observations
  //target += multi_normal_lpdf(x|mu,Sigma);
}'

dat2=list(x=dat,N=dim(dat)[1],J=dim(dat)[2])
fit2=stan(model_code=stan_code,data=dat2,cores=4,iter = 5000,chains=4)

Warning message (BTW, I didn’t try change tree-depth etc… as it is fairly simple problem).

Warning messages:
1: There were 157 divergent transitions after warmup. See
Runtime warnings and convergence problems
to find out why this is a problem and how to eliminate them.
2: There were 9797 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
Runtime warnings and convergence problems
3: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
Runtime warnings and convergence problems
4: Examine the pairs() plot to diagnose sampling problems

5: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
Runtime warnings and convergence problems
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
Runtime warnings and convergence problems
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
Runtime warnings and convergence problems

Session info

R version 4.0.4 (2021-02-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8
[6] LC_MESSAGES=C.UTF-8 LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] bridgesampling_1.0-0 bayesplot_1.8.0 mvtnorm_1.1-1 rstan_2.21.2 ggplot2_3.3.3
[6] StanHeaders_2.21.0-7

loaded via a namespace (and not attached):
[1] Brobdingnag_1.2-6 tinytex_0.30 tidyselect_1.1.0 xfun_0.22 purrr_0.3.4 lattice_0.20-41
[7] V8_3.4.0 colorspace_2.0-0 vctrs_0.3.6 generics_0.1.0 stats4_4.0.4 loo_2.4.1
[13] utf8_1.2.1 rlang_0.4.10 pkgbuild_1.2.0 pillar_1.5.1 glue_1.4.2 withr_2.4.1
[19] DBI_1.1.1 plyr_1.8.6 matrixStats_0.58.0 lifecycle_1.0.0 stringr_1.4.0 munsell_0.5.0
[25] gtable_0.3.0 codetools_0.2-18 coda_0.19-4 labeling_0.4.2 fst_0.9.4 inline_0.3.17
[31] callr_3.5.1 ps_1.6.0 parallel_4.0.4 curl_4.3 fansi_0.4.2 Rcpp_1.0.6
[37] scales_1.1.1 RcppParallel_5.0.3 jsonlite_1.7.2 farver_2.1.0 gridExtra_2.3 digest_0.6.27
[43] stringi_1.5.3 processx_3.5.0 dplyr_1.0.5 grid_4.0.4 cli_2.3.1 tools_4.0.4
[49] magrittr_2.0.1 tibble_3.1.0 crayon_1.4.1 pkgconfig_2.0.3 ellipsis_0.3.1 Matrix_1.3-2
[55] prettyunits_1.1.1 ggridges_0.5.3 rstudioapi_0.13 assertthat_0.2.1 R6_2.5.0 compiler_4.0.4

scellus · March 28, 2021, 11:42am

Your covariance matrix is of rank one:

> cor(dat)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

In retrospect that is clear from the code, but it was well camouflaged. ;)
(Unrelated to that, an efficient implementation would use Cholesky and an isotropic normal.)

torkar · March 28, 2021, 1:05pm

And the SUG provides some input also :)

ecosang · March 28, 2021, 4:00pm

Thanks a lot. My stupid mistake. I had a wrong equation for cov matirx. Now I think I got correct inference…

set.seed(1234)
mat_temp=matrix(rnorm(9),3,3)
mat_temp[upper.tri(mat_temp)]=0
cov=mat_temp%*%t(mat_temp)
#cov=matrix(c(1,0.5,0.3),nrow=3,ncol=1)%*%matrix(c(1,0.5,0.3),nrow=1,ncol=3)
dat=rmvnorm(n=100,mean=c(0,6,7),sigma=cov)

Topic		Replies	Views
Tip for Fitting Multivariate Normal Modeling	0	1607	August 9, 2019
Some questions about multi_normal distribution Modeling rstan	17	757	August 25, 2021
Error: Log probability evaluates to log(0), i.e. negative infinity. (Multivariate Gaussian) Modeling fitting-issues , performance	11	1820	December 28, 2021
Lkj_corr_lpdf: Correlation matrix is not positive definite General	9	1030	July 12, 2023
Trouble recreating SUG MVN prior example Modeling multivariate-normal	4	113	July 5, 2024

A simple multivariate normal distribution failed. Anything wrong?

Related topics