Dealing with constraints on correlations

I am wondering how Stan handles constraints on correlations. For example, with a 3 by 3 matrix, the three correlations cannot each be uniform on (-1, 1) independently. Yet when I fit a correlation matrix of this size, I do not receive any error unless I include a generated quantities block, which then reports that a value has been rejected. I take this to mean that the matrix is checked for positive definiteness and the draw is rejected if the check fails, before proceeding. Hence the sampler does not throw an error and stop.

Note that this model is a simple example, which I hope to scale up to a 4 by 4 random-effects covariance matrix. Also, while I do know that the LKJ prior is currently recommended by the Stan team, in my application I need independent priors for each correlation.
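To see why rejections are expected here, a quick numerical sketch (numpy, just for illustration, not part of the Stan model): if the three pairwise correlations are drawn independently and uniformly from (-1, 1), a sizeable fraction of the resulting 3 by 3 correlation matrices are not positive definite, and those are exactly the draws the sampler has to reject.

```python
import numpy as np

rng = np.random.default_rng(0)

def is_pd(r12, r13, r23):
    """True if the 3x3 correlation matrix with these off-diagonals is positive definite."""
    R = np.array([[1.0, r12, r13],
                  [r12, 1.0, r23],
                  [r13, r23, 1.0]])
    return np.all(np.linalg.eigvalsh(R) > 0)

# Draw each correlation independently from Uniform(-1, 1)
draws = rng.uniform(-1, 1, size=(100_000, 3))
valid = np.mean([is_pd(*d) for d in draws])
print(f"fraction of draws that are positive definite: {valid:.3f}")
```

On this run, well over a third of the independent draws fail the positive-definiteness check.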

model <- '
// Pearson Correlation
data {
  int<lower=0> n;
  vector[3] x[n];
}
parameters {
  vector[3] mu;
  vector<lower=0>[3] sigma;
  vector<lower=-1,upper=1>[3] r;
}
transformed parameters {
  cov_matrix[3] S;
  // Reparameterization
  S[1,1] = square(sigma[1]);
  S[1,2] = r[1] * sigma[1] * sigma[2];
  S[2,1] = r[1] * sigma[1] * sigma[2];

  S[1,3] = r[2] * sigma[1] * sigma[3];
  S[3,1] = r[2] * sigma[3] * sigma[1];

  S[2,3] = r[3] * sigma[2] * sigma[3];
  S[3,2] = r[3] * sigma[3] * sigma[2];

  S[2,2] = square(sigma[2]);
  S[3,3] = square(sigma[3]);
}
model {
  // Priors
  mu ~ normal(0, 100);
  sigma[1] ~ cauchy(0, 3);
  sigma[2] ~ cauchy(0, 3);
  sigma[3] ~ cauchy(0, 3);
  r[1] ~ uniform(-1, 1);
  r[2] ~ uniform(-1, 1);
  r[3] ~ uniform(-1, 1);

  // Data
  x ~ multi_normal(mu, S);
}
generated quantities {
  vector[3] ppc;
  ppc = multi_normal_rng(mu, S);
}'

Here is the R code:

library(rstan)

x <- scale(mtcars[, 1:3])

mod <- stan_model(model_code = model, verbose = TRUE)

fit <- sampling(mod, data = list(x = x, n = nrow(x)))


Hi,
I would expect that declaring cov_matrix in transformed parameters results in a check of symmetry and positive definiteness at the end of the block, but maybe positive definiteness is not checked there. @Bob_Carpenter likely has a better understanding of what exactly is tested and when.

Generally the way to go is to construct your transformed parameters so that they always satisfy the constraints you put on them. If you need to compute a covariance matrix, make sure that for any value of sigma and r, S is symmetric and positive definite (one construction that achieves this is shown in the manual at https://mc-stan.org/docs/2_21/reference-manual/covariance-matrices-1.html).

Fully independent priors on the correlations are IMHO an impossible goal. Take for example a covariance matrix of the form:

\begin{pmatrix} 2 & 1 & x \\ 1 & 2 & 1 \\ x & 1 & 2 \end{pmatrix}

This is a valid covariance matrix only for -1 < x < 2: the first two leading principal minors are 2 and 3, and the determinant is 4 + 2x - 2x^2 = -2(x - 2)(x + 1), which is positive exactly on that interval. So once some correlations are known, the others are constrained, i.e. the correlations are not independent.
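A quick numerical confirmation of that interval (a numpy sketch, just for illustration):

```python
import numpy as np

def is_pd(x):
    """True if the example covariance matrix is positive definite for this x."""
    M = np.array([[2.0, 1.0, x],
                  [1.0, 2.0, 1.0],
                  [x,   1.0, 2.0]])
    return np.all(np.linalg.eigvalsh(M) > 0)

print(is_pd(0.0), is_pd(1.9), is_pd(-0.9))  # inside (-1, 2): all True
print(is_pd(2.1), is_pd(-1.1))              # outside (-1, 2): both False
```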

Hope that helps!


You can try to build the Cholesky-decomposed correlation (and covariance) matrix directly. Check out the Wikipedia page on the Cholesky decomposition to see what goes where. This way you can put priors on the "primitive" parameters, i.e. the individual rhos and sigmas. In the case above it would be something like this:

data {
  
  int<lower=0> n;
  vector[3] x[n];
  
}
parameters {
  
  vector[3] mu;
  vector<lower=0>[3] sigma;
  real<lower=0,upper=1> rho_raw[3];
  
}
transformed parameters{
  
  real<lower=-1,upper=1> rho[3];
  
  matrix[3,3] L_Omega = rep_matrix(0.0, 3, 3);
  matrix[3,3] L_Sigma;
  matrix[3,3] Omega;
  matrix[3,3] Sigma;
  
  for(i in 1:3)
    rho[i] = 2*(rho_raw[i] - 0.5);

  // this is very explicit -- you can probably write a concise function for this...
  L_Omega[1,1] = 1;
  L_Omega[2,1] = rho[1];
  L_Omega[3,1] = rho[2];
  
  L_Omega[2,2] = sqrt(1 - square(rho[1]));
  L_Omega[3,2] = (rho[3] - rho[1]*rho[2]) / L_Omega[2,2];
  // Note: the argument of this sqrt can go negative (giving NaN and a rejected
  // draw) when the implied correlation matrix is not positive definite.
  L_Omega[3,3] = sqrt(1 - square(L_Omega[3,1]) - square(L_Omega[3,2]));
  
  L_Sigma = diag_pre_multiply(sigma, L_Omega);
  
  Omega = multiply_lower_tri_self_transpose(L_Omega);
  Sigma = quad_form_diag(Omega, sigma);
  
}
model {
  
  x ~ multi_normal_cholesky(mu, L_Sigma);
  
  rho_raw[1] ~ beta(3,2);
  rho_raw[2] ~ beta(5,7);
  rho_raw[3] ~ beta(3,4);
  
  sigma[1] ~ cauchy(0,3);
  sigma[2] ~ cauchy(0,3);
  sigma[3] ~ cauchy(0,3);
  
  mu ~ normal(0, 100);
    
}
generated quantities{
  
  vector[3] ppc = multi_normal_rng(mu, Sigma);
  
}
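To sanity-check the explicit construction above, here is a small numpy sketch (not part of the original answer) that mirrors the Stan code and verifies that multiplying the factor by its transpose recovers the intended correlation matrix for one valid choice of rhos:

```python
import numpy as np

def chol_corr_3x3(rho):
    """Lower Cholesky factor of a 3x3 correlation matrix, mirroring the Stan code."""
    r12, r13, r23 = rho
    L = np.zeros((3, 3))
    L[0, 0] = 1.0
    L[1, 0] = r12
    L[2, 0] = r13
    L[1, 1] = np.sqrt(1 - r12**2)
    L[2, 1] = (r23 - r12 * r13) / L[1, 1]
    # NaN here if the implied correlation matrix is not positive definite
    L[2, 2] = np.sqrt(1 - L[2, 0]**2 - L[2, 1]**2)
    return L

L = chol_corr_3x3([0.5, 0.3, 0.4])
Omega = L @ L.T   # off-diagonals recover the rhos; diagonal is all ones
print(np.round(Omega, 6))
```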

To find a suitable prior for rho_raw I usually do something like this:

N <- 1e5
hist(2 * (rbeta(N, 3, 4) - 0.5)) # switch out 3, 4 to explore other priors
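The same prior-predictive check can be sketched in Python (numpy, for anyone not working in R; the Beta parameters are just the ones from the snippet above):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Matches rho = 2*(rho_raw - 0.5) in the Stan code above
rho = 2 * (rng.beta(3, 4, size=N) - 0.5)

# Beta(3,4) has mean 3/7, so rho centers near 2*(3/7 - 1/2) = -1/7
print(f"range: ({rho.min():.3f}, {rho.max():.3f}), mean: {rho.mean():.3f}")
```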

Cheers!
