I’m wondering how waic and loo operate when applied to multivariate models. Specifically, I’m interested in fitting and scoring Bayesian networks or path models using brms. For instance, asking whether three variables A, B, and C are better explained by a causal chain structure, e.g. A --> B --> C, or by a common-cause structure, e.g. A <-- B --> C. In the first model, the “response” variables are B and C, with A acting only as a predictor. In the second model, the response variables are A and C, with B acting only as a predictor. In brms formulas:
model1 = mvbf(B ~ A, C ~ B)
model2 = mvbf(A ~ B, C ~ B)
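Spelled out in full, I mean something like this (an untested sketch; it assumes a data frame `dat` with columns A, B, and C):

```r
library(brms)

# Causal chain A -> B -> C: responses are B and C
fit1 <- brm(mvbf(B ~ A, C ~ B), data = dat)

# Common cause A <- B -> C: responses are A and C
fit2 <- brm(mvbf(A ~ B, C ~ B), data = dat)

loo(fit1)
loo(fit2)
```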
Will I be able to meaningfully compare these models by waic or loo (setting aside concerns about inferring causal relationships)? I ask because, if I were to use the predict() function, brms would be happy to predict B from A in the first model, but it would take more work to do that in the second model. I have it in my head that waic and loo internally depend on predict(), but maybe that’s wrong.
I’ve looked at the multivariate vignette where loo is used, but there the models being compared always concern the same response variables.
Like @paul.buerkner said, you won’t be able to use LOO or WAIC to compare models with different outcomes. The reason for that is simply that both LOO and WAIC are methods for estimating predictive performance, but models with different outcomes are predicting different things.
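To spell that out with the models above: model1’s pointwise log score is log p(B[i], C[i] | A[i]), while model2’s is log p(A[i], C[i] | B[i]). These are densities over different random variables, so the resulting elpd estimates are not on a common scale, and their difference is not meaningful.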
I don’t think it will matter at all, but to avoid potential problems in my implementation (which are hopefully not there), I would just go for the same order (no matter what the order is).
I have a slightly different flavor of Derek’s question, also about using WAIC / LOOIC with multivariate responses. I’m mostly interested in the case where we have a matrix of data Y: do these metrics need to be calculated on each element Y[i,j], or is the ‘observation’ the row Y[i,]? (I’ve been doing the latter.)
For a simple example, take a regression with multiple responses that are assumed to be
Y ~ MVN(mu, Sigma).
If Sigma is diagonal, the likelihood for each data point could be included as
log_lik[i] = normal_lpdf(…)
But if off-diagonal elements of Sigma are != 0, I was under the impression that we need to calculate
log_lik[i] = multi_normal_cholesky_lpdf(…)
Is this correct, or do we need to calculate the marginal for each Y[i,j]?
I think the brms package also does the latter, but appears to do so after estimation.
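To make the per-row vs. per-element distinction concrete, here is a toy check of my own in R (using the mvtnorm package rather than brms internals):

```r
library(mvtnorm)

# Residual covariance with nonzero off-diagonal elements
Sigma <- matrix(c(1.0, 0.6,
                  0.6, 1.0), nrow = 2)
y  <- c(0.5, -0.2)  # one row Y[i, ]
mu <- c(0, 0)

# "Observation = row": joint log density of Y[i, ]
joint <- dmvnorm(y, mean = mu, sigma = Sigma, log = TRUE)

# "Observation = element": sum of the univariate marginal log densities of Y[i, j]
marginals <- sum(dnorm(y, mean = mu, sd = sqrt(diag(Sigma)), log = TRUE))

joint - marginals  # nonzero here; the two agree only when Sigma is diagonal
```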
In at least some cases, brms used to do something wrong with PSIS-LOO for a multivariate Gaussian likelihood, and it’s possible that this is not yet fixed in the latest release.
I believe brms is doing the right thing when computing the log-likelihood of multivariate models. If the purpose is to get one (joint) likelihood per observation (i.e., across response variables), brms will use the multivariate normal density if Sigma is not diagonal and a simple product of univariate normal densities if Sigma is diagonal.
You can extract log-likelihood values of a single response variable by means of the resp argument.
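For example (a sketch; assume `fit` is a multivariate brms model with response variables B and C):

```r
ll_joint <- log_lik(fit)              # draws x observations, joint across B and C
ll_B     <- log_lik(fit, resp = "B")  # contributions of response B only
loo_B    <- loo(fit, resp = "B")      # LOO restricted to response B
```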