Subsetting covariance matrix for multivariate cholesky

jroon · March 16, 2019, 2:04pm

In the Stan manual section 4.5. Missing Multivariate Data, the example likelihood for a model with two outcomes and missing values looks like this:

y[ns12] ~ multi_normal(mu, Sigma);
y[ns1] ~ normal(mu[1], sqrt(Sigma[1, 1]));
y[ns2] ~ normal(mu[2], sqrt(Sigma[2, 2]));

… where the multi_normal is applied for rows with both outcomes present, and the normals applied when only outcome 1 or 2 present respectively.

My question is, if you are using the cholesky decomposition form of the multivariate normal, can this subsetting by done in an analgous fashion ? In other words, if instead of Sigma I have an L_Sigma formed by cholesky decomposition, is it legitimate to do this:

y[ns12] ~ multi_normal_cholesky(mu, L_Sigma);
y[ns1] ~ normal(mu[1], sqrt(L_Sigma[1, 1]));
y[ns2] ~ normal(mu[2], sqrt(L_Sigma[2, 2]));

Or do I need to manipulate L_Sigma first ? Or is it just best to stay with the multi_normal, even though that may come with performance deficits ?

bgoodri · March 16, 2019, 3:24pm

The example in the Missing Multivariate Data section

is not even correct because

It does not get the conditional normal distribution right when one margin is observed, which should be something like https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Bivariate_case_2
It does not consider the case where both margins are missing, in which case you would need latents

Here is a more general version using latents, and R is a two-dimensional binary array indicating whether a cell of the N \times K data matrix \mathbf{X} is observed or missing:

data {
  int<lower=1> N; 
  int<lower=1> N_obs;
  int<lower=1> K;
  int<lower=0, upper=1> R[N, K];
  vector[N_obs] x_o; // only values where R[i,k] == 1
} // x_o must be in “row-major” order!!!
parameters {
  // missing values we want the distribution of
  vector[N * K - N_obs] x_m;
  // common but not very interesting parameters
  vector[K] mu; 
  cholesky_factor_cov[K, K] L;
}
model {
  row_vector[K] x_c[N]; // better than a matrix
  int obs_mark = 1; int miss_mark = 1;
  for (i in 1:N) for (k in 1:K) { // row-major
    if (R[i,k] == 1) { 
      x_c[i,k] = x_o[obs_mark];
      obs_mark += 1;
    } // observed case above, missing below
    else {
      x_c[i,k] = x_m[miss_mark];
      miss_mark += miss_mark + 1;
    }
  } // ignore warning about Jacobian
  target += multi_normal_cholesky_lpdf(x_c | mu, L);

  // priors on mu and L
}

jroon · March 16, 2019, 5:05pm

Thanks Ben. What I’m trying to do is not actually a missing values problem but something where I was using the missing values code as a template to work from. However, I think the same basic argument you made for missing values probably also applies so I will rethink things! Many thanks for the informative answer

Topic		Replies	Views
Efficiency of multivariate models with partially missing data General techniques , specification , performance	3	467	April 24, 2023
Multivariate Likelihoods Modeling	10	3057	June 29, 2017
Inference for multinormal on cholesky-transformed data fails to recover sensible estimates Modeling fitting-issues	3	46	January 17, 2025
Diagonal covariance matrix Modeling	5	2545	January 13, 2019
Multivariate normal data covariance estimation Modeling performance , covariance , multivariate-normal	3	1429	May 10, 2020

Subsetting covariance matrix for multivariate cholesky

Related topics