# Subsetting covariance matrix for multivariate cholesky

#1

In the Stan manual section 4.5. Missing Multivariate Data, the example likelihood for a model with two outcomes and missing values looks like this:

y[ns12] ~ multi_normal(mu, Sigma);
y[ns1] ~ normal(mu[1], sqrt(Sigma[1, 1]));
y[ns2] ~ normal(mu[2], sqrt(Sigma[2, 2]));


… where the multi_normal is applied for rows with both outcomes present, and the normals applied when only outcome 1 or 2 present respectively.

My question is, if you are using the cholesky decomposition form of the multivariate normal, can this subsetting by done in an analgous fashion ? In other words, if instead of Sigma I have an L_Sigma formed by cholesky decomposition, is it legitimate to do this:

y[ns12] ~ multi_normal_cholesky(mu, L_Sigma);
y[ns1] ~ normal(mu[1], sqrt(L_Sigma[1, 1]));
y[ns2] ~ normal(mu[2], sqrt(L_Sigma[2, 2]));


Or do I need to manipulate L_Sigma first ? Or is it just best to stay with the multi_normal, even though that may come with performance deficits ?

#2

The example in the Missing Multivariate Data section

is not even correct because

Here is a more general version using latents, and R is a two-dimensional binary array indicating whether a cell of the N \times K data matrix \mathbf{X} is observed or missing:

data {
int<lower=1> N;
int<lower=1> N_obs;
int<lower=1> K;
int<lower=0, upper=1> R[N, K];
vector[N_obs] x_o; // only values where R[i,k] == 1
} // x_o must be in “row-major” order!!!
parameters {
// missing values we want the distribution of
vector[N * K - N_obs] x_m;
// common but not very interesting parameters
vector[K] mu;
cholesky_factor_cov[K, K] L;
}
model {
row_vector[K] x_c[N]; // better than a matrix
int obs_mark = 1; int miss_mark = 1;
for (i in 1:N) for (k in 1:K) { // row-major
if (R[i,k] == 1) {
x_c[i,k] = x_o[obs_mark];
obs_mark += 1;
} // observed case above, missing below
else {
x_c[i,k] = x_m[miss_mark];
miss_mark += miss_mark + 1;
}
} // ignore warning about Jacobian
target += multi_normal_cholesky_lpdf(x_c | mu, L);

// priors on mu and L
}


#3

Thanks Ben. What I’m trying to do is not actually a missing values problem but something where I was using the missing values code as a template to work from. However, I think the same basic argument you made for missing values probably also applies so I will rethink things! Many thanks for the informative answer