# Covariance is not positive definite

Hi!

I’ve read the book “Statistical Rethinking” by Richard McElreath and now I’m playing with Gaussian processes. I want to simulate some data, and for that I need to generate a covariance matrix.

First I generated a distance matrix:

```r
set.seed(5)
N_PARTICLE <- 10
# Fill a matrix with random values, then force a zero diagonal and symmetry
particle_distance_matrix <- matrix(runif(N_PARTICLE * N_PARTICLE, 3, 20),
                                   nrow = N_PARTICLE, ncol = N_PARTICLE)
diag(particle_distance_matrix) <- 0
particle_distance_matrix[lower.tri(particle_distance_matrix)] <-
  t(particle_distance_matrix)[lower.tri(particle_distance_matrix)]
```

Then I chose the parameters for the Gaussian kernel:

```r
n_sq  <- 9     # maximum covariance between any two particles
p_sq  <- 0.01  # rate of decline of correlation with distance
delta <- 0.01  # small jitter added to the diagonal; the book uses it for numerical stability
```

Next I generated the covariance matrix (Gaussian kernel). I’ve followed the cov_GPL2 function that is used in Richard’s book: https://github.com/rmcelreath/rethinking/blob/master/R/ulam_templates.R#L1387

```r
covariance_matrix <- matrix(NA, nrow = N_PARTICLE, ncol = N_PARTICLE)
# Gaussian (squared-exponential) kernel
for (i in 1:(N_PARTICLE - 1)) {
  covariance_matrix[i, i] <- n_sq + delta
  for (j in (i + 1):N_PARTICLE) {
    covariance_matrix[i, j] <- n_sq * exp(-p_sq * particle_distance_matrix[i, j]^2)
    covariance_matrix[j, i] <- covariance_matrix[i, j]
  }
}
covariance_matrix[N_PARTICLE, N_PARTICLE] <- n_sq + delta
```

However, when I try to sample from a multivariate normal using this covariance matrix, I get the following error:

```r
library(MASS)  # for mvrnorm
mvrnorm(n = 20, mu = rep(0, N_PARTICLE), Sigma = covariance_matrix)
```

```
Error in mvrnorm(n = 20, mu = rep(0, N_PARTICLE), Sigma = covariance_matrix): 'Sigma' is not positive definite
Traceback:

1. mvrnorm(n = 20, mu = rep(0, N_PARTICLE), Sigma = covariance_matrix)
2. stop("'Sigma' is not positive definite")
```

Can someone help me figure out what is wrong?

I think the following conditions are probably required for the distance matrix to produce a positive-definite covariance matrix:

1. distance to itself is always zero
2. distance from i to j is the same as distance from j to i
3. triangle inequality: the distance from i to j is at most the distance from i to k plus the distance from k to j
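A quick way to test condition 3 on a candidate matrix is to count violations directly. A sketch in R (the helper name `triangle_violations` is mine; `d` stands for the distance matrix from the original post):

```r
# Count triples (i, j, k) where d[i, j] > d[i, k] + d[k, j],
# i.e. where the triangle inequality fails
triangle_violations <- function(d) {
  n <- nrow(d)
  # outer(d[, k], d[k, ], "+")[i, j] equals d[i, k] + d[k, j]
  sum(sapply(1:n, function(k) d > outer(d[, k], d[k, ], "+")))
}
```

For a matrix filled with IID uniform draws, this count will typically be nonzero, whereas it is always zero for distances derived from actual positions.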

If the entries in the so-called distance matrix are IID, then the entries in the covariance matrix are also IID, and such a matrix will generally not be positive definite. Maybe first draw position vectors and then calculate distances as differences of positions.
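A minimal R sketch of that suggestion (the choice of 2-D positions and their range are my assumptions; variable names match the original post):

```r
set.seed(5)
N_PARTICLE <- 10
# Draw 2-D positions, then derive distances from them; the resulting
# matrix satisfies all three conditions by construction
positions <- matrix(runif(N_PARTICLE * 2, 0, 20), nrow = N_PARTICLE, ncol = 2)
particle_distance_matrix <- as.matrix(dist(positions))
```

Because the distances come from real points, the kernel matrix built from them should be positive (semi-)definite, with the delta jitter covering numerical round-off.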

I forgot one block when copying it from my Jupyter notebook:

```r
diag(particle_distance_matrix) <- 0
particle_distance_matrix[lower.tri(particle_distance_matrix)] <-
  t(particle_distance_matrix)[lower.tri(particle_distance_matrix)]
```

This is what it looks like:

I think the third requirement (the triangle inequality) may not be met.

Investigating this, it appears that p_sq needs to be matched to the minimum distance. If I write

p^2 = \frac{1}{2l^2}

then a p_sq of 0.01 corresponds to an l of about 7.07. If I generate the distances with runif(N_PARTICLE*N_PARTICLE, 8, 20), I can produce a valid covariance matrix most of the time. If you want the minimum distance to be 3, then p_sq \approx 0.06 (i.e. l \approx 2.88) seems a good choice.
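In R, that correspondence is just this arithmetic (a sketch; `p_sq_for_l` is a name I made up for the conversion):

```r
# Lengthscale l implied by a given p_sq, from p^2 = 1 / (2 * l^2)
p_sq <- 0.01
l <- sqrt(1 / (2 * p_sq))           # about 7.07

# And the inverse: p_sq implied by a chosen lengthscale
p_sq_for_l <- function(l) 1 / (2 * l^2)
p_sq_for_l(2.88)                    # about 0.06
```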

Thanks a lot!

I think I will go the way @nhuurre suggested: draw my position vectors first and calculate the distances from those.