Stans pickyness regarding covariance matrices



stan seems quite picky with regard to covariance matrices. Specifically, it throws an error if a covariance matrix is semi-definite, which still is a valid covariance matrix.

The following MWE demonstrates this.

  int<lower=1> maxTime;
  vector[maxTime] Y;

  real theVar;

transformed parameters{
  vector[maxTime] mu;
  matrix[maxTime,maxTime] Sigma;
  for (i in 1:maxTime){

  for (i in 1:maxTime){
    for( j in 1:maxTime){
      Sigma[i,j] = theVar;

    Y ~ multi_normal(mu, Sigma);

which I run in R using


theModel <- stan_model(file='stan_file.stan')
maxTime <- 10
Y <- mvrnorm(n=1,mu=rep(0,10),Sigma=matrix(3,nrow = maxTime,ncol = maxTime))
theData <- list(maxTime=10,Y=Y)
fittedModel <- optimizing(theModel,theData,hessian = TRUE,as_vector=FALSE)

It throws

Exception: multi_normal_lpdf: LDLT_Factor of covariance parameter is not positive definite.  last conditional variance is 0. 
Error in sampler$call_sampler(c(args, dotlist)) : Initialization failed.

Is there a way to fit models that imply a semi-definite covariance matrix in stan?


It isn’t that “Stan is picky”; it is that multi_normal_lpdf assumes the covariance matrix is positive definite. If you want a log-density that assumes semi-definite, you have to write the function yourself in the functions block.


We did write that multi_normal_lpdf. But the point is that the multivariate normal requires the covariance matrix to be positive definite in order to have a proper distribution. With semidefinite covariance, you’re restricted to a submanifold that will be up to you to enforce with constraints. We’ve found soft constraints get the job done and are better than hard constraints in some of these cases, like in spatial ICAR models, which are also improper due to semidefiniteness, but they only have a single rank of deficiency which can be made up by assuming a sum-to-zero constraint on the parameters.