matrix[K,K] vs cholesky_factor_cov[K]

I’ve got a model in which I am sampling a covariance matrix by separately sampling the Cholesky factor of the correlation matrix (with an LKJ prior) and the diagonal variances as positive scalars, as per the recipe here. (Aside – this recipe should probably appear somewhere more prominent than just a deprecation note!) In the course of the calculation, I have to construct the Cholesky decomposition of the full covariance matrix.

In practice I know that I am often modelling a system with a very small variance in one of the variables, much lower than the related noise variance.

Part of the model is given by

parameters { 
  cholesky_factor_corr[K] L;  // Cholesky factor of the correlation matrix
  vector<lower=0>[K] Cii;     // diagonal variances
  ... 
}

transformed parameters {
  vector[K] sv = sqrt(Cii);  // standard deviations
  cholesky_factor_cov[K] Cchol = diag_pre_multiply(sv, L);  // Cholesky factor of the covariance
  ...
}

model {
  L ~ lkj_corr_cholesky(1.0);
  ...
}
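For anyone checking the algebra behind diag_pre_multiply here: the identity can be verified numerically. This is a sketch in Python/NumPy with made-up 2×2 example values (not Stan), just to show that scaling the correlation Cholesky factor by the standard deviations yields the Cholesky factor of the full covariance:

```python
import numpy as np

# Assumed example values: a 2x2 correlation Cholesky factor L and variances Cii.
rho = 0.7
L = np.array([[1.0, 0.0],
              [rho, np.sqrt(1.0 - rho**2)]])  # Cholesky factor of the correlation matrix
Cii = np.array([2.0, 0.5])                    # diagonal variances
sv = np.sqrt(Cii)                             # standard deviations

# diag_pre_multiply(sv, L) in Stan is diag(sv) @ L
Cchol = np.diag(sv) @ L

# Cchol is the Cholesky factor of the covariance diag(sv) @ R @ diag(sv)
Sigma = Cchol @ Cchol.T
print(np.allclose(np.linalg.cholesky(Sigma), Cchol))  # True
```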

With this, I regularly get warnings from the cholesky_factor_cov validation like

  Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
  Exception: model_bolsxywx_namespace::log_prob: Cchol[2] is 0, but must be positive! (in '/var/folders/f5/bwf9396x1k5c0qgk08cvrbt00000gn/T/httpstan_jihvze_8/model_bolsxywx.stan', line 24, column 2 to column 58)
  If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
  but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
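One plausible mechanism for this (my assumption, not something the message states): Stan constrains a `<lower=0>` parameter via a log transform, so an extreme warmup proposal on the unconstrained scale can underflow exp() to exactly 0.0 in double precision, making the corresponding diagonal entry of Cchol exactly zero and tripping the positivity check. A minimal illustration of the floating-point behaviour, with hypothetical values:

```python
import math

# Hypothetical extreme warmup proposal for the unconstrained log(Cii[2])
unconstrained = -800.0
Cii_2 = math.exp(unconstrained)  # underflows to exactly 0.0
sv_2 = math.sqrt(Cii_2)          # sqrt(0.0) == 0.0
L_22 = 0.9                       # some positive diagonal entry of L
Cchol_22 = sv_2 * L_22           # diagonal of Cchol is exactly 0.0
print(Cchol_22)                  # 0.0 -> fails the "must be positive" check
```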

These warnings disappear if I change the cholesky_factor_cov declaration to

  matrix[K,K] Cchol = diag_pre_multiply(sv, L); 

which removes the check that Cchol is a valid Cholesky factor (lower triangular with a strictly positive diagonal), and the error messages along with it.

So: Is there any reason to prefer cholesky_factor_cov[K] over matrix[K,K]? I know from the structure of the model that the implied covariance is always positive semi-definite — what does the cholesky_factor_cov declaration buy me in this case? Is matrix multiplication more efficient due to the lower-triangular structure, or less efficient due to the validity check? Are the error messages actually telling me anything useful?


No, matrix multiplication does not exploit the triangular structure. The only thing cholesky_factor_cov buys you is a runtime check that you have implemented the model structure correctly.


The section of the user guide on hierarchical models is the primary place where this parameterization lives.