WAIC and loo in multivariate models with different response variables


#1

I’m wondering how waic and loo operate when applied to multivariate models. Specifically, I’m interested in computing and scoring Bayesian networks or path models using brms. For instance, asking whether three variables A, B, and C are better explained by a causal chain structure, e.g. A --> B --> C, or by a common cause structure, e.g. A <-- B --> C. In the first model, the “response” variables are B and C, with A only acting as a predictor. In the second model, the response variables are A and C, with B only acting as a predictor. In brms formulas:

model1 = mvbf(B ~ A, C ~ B)
model2 = mvbf(A ~ B, C ~ B)

Will I be able to meaningfully compare these models by waic or loo (setting aside concerns about inferring causal relationships)? I ask because, if I were to use the predict() function, brms would be happy to predict B from A in the first model, but it would take more work to do that in the second model. I have it in my head that waic and loo internally depend on predict(), but maybe that’s wrong.
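
To make the predict() point concrete, this is roughly what I mean (fit1 and dat are just illustrative names for a fitted version of the first model and the data):

fit1 <- brm(mvbf(B ~ A, C ~ B), data = dat)   # hypothetical fit of model 1
predict(fit1, resp = "B")                     # predicting B (given A) is straightforward here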

I’ve looked at the multivariate vignette where loo is used, but there the models being compared always concern the same response variables.

Thanks!

  • brms Version: 2.2

#2

WAIC or LOO are only valid if models use the exact same responses, which does not seem to be true for your example.


#3

Like @paul.buerkner said, you won’t be able to use LOO or WAIC to compare models with different outcomes. The reason for that is simply that both LOO and WAIC are methods for estimating predictive performance, but models with different outcomes are predicting different things.


#4

Thanks Jonah for providing the explanation. I guess sometimes my answers are a bit too short…


#5

Roger that. Would a solution be to add a prediction for the left-out variable to the model? So my models would become:

model1 = mvbf(B ~ A, C ~ B, A ~ 1)
model2 = mvbf(A ~ B, C ~ B, B ~ 1)

This seems to be the way Bayesian networks are scored, from what I understand.
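
For concreteness, this is roughly what I have in mind (dat standing for a data frame with columns A, B, and C):

library(brms)

fit1 <- brm(mvbf(B ~ A, C ~ B, A ~ 1), data = dat)
fit2 <- brm(mvbf(A ~ B, C ~ B, B ~ 1), data = dat)

loo(fit1, fit2)   # both models now treat A, B, and C as responses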


#6

It could be an option, yes. To be sure WAIC and LOO do the right thing, I would probably put the response variables in the same order for both models.
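
For example, keeping the responses in the same A, B, C order in both calls:

model1 = mvbf(A ~ 1, B ~ A, C ~ B)
model2 = mvbf(A ~ B, B ~ 1, C ~ B)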


#7

Great. As for order, are there any recommendations (e.g., endogenous -> exogenous, etc.), or should I just ensure they are the same?


#8

I don’t think it will matter at all, but to avoid potential problems in my implementation (which are hopefully not there), I would just go for the same order (no matter what the order is).


#9

Gotcha. Thanks so much for your incredibly speedy help (and for brms of course).


#10

I have a slightly different flavor of Derek’s question, but also about using WAIC / LOOIC with multivariate responses. I’m mostly interested in the case where we have a matrix of data Y: in the multivariate case, do these metrics need to be calculated on each element, Y[i,j], or is the ‘observation’ the row Y[i,]? (I’ve been doing the latter.)

For a simple example, in the case of regression with multiple responses which are assumed to be
Y ~ MVN(mu, Sigma).
If Sigma is diagonal, the likelihood for each datapoint could be included as
log_lik[i] = normal_lpdf(…)

But if off-diagonal elements of Sigma are != 0, I was under the impression that we need to calculate
log_lik[i] = multi_normal_cholesky_lpdf(…)

  • Is this correct, or do we need to calculate the marginal for each Y[i,j]?

I think the brms package also does the latter, but appears to do so after estimation.
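
To spell out the distinction, here is a small standalone R sketch (using mvtnorm rather than brms internals, with made-up numbers) of the two quantities for one row Y[i,]:

library(mvtnorm)   # for dmvnorm()

y_i   <- c(1.2, -0.5)                  # one row of the response matrix (made-up values)
mu_i  <- c(1.0,  0.0)
Sigma <- matrix(c(1.0, 0.6,
                  0.6, 2.0), nrow = 2) # non-diagonal covariance

# joint (per-row) log-likelihood, i.e. the multi_normal version
ll_joint <- dmvnorm(y_i, mean = mu_i, sigma = Sigma, log = TRUE)

# sum of univariate normal log-densities, i.e. the normal_lpdf version
ll_indep <- sum(dnorm(y_i, mean = mu_i, sd = sqrt(diag(Sigma)), log = TRUE))

ll_joint; ll_indep   # these agree only when the off-diagonals of Sigma are zero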


#11

This depends on what your prediction task is; choose the one that matches it.

It is correct if you want to know the predictive performance jointly.

If your prediction task is to predict single observations instead, check a new case study for PSIS-LOO for multivariate Gaussian likelihoods http://mc-stan.org/loo/articles/loo2-non-factorizable.html

At least in some parts, brms used to do something wrong with PSIS-LOO for multivariate Gaussian likelihoods, and it’s possible that it’s not yet fixed in the latest release.


#12

I believe brms is doing the right thing when computing the log-likelihood of multivariate models. If the purpose is to get one (joint) likelihood per observation (i.e. across response variables), brms will use the multivariate normal density if Sigma is not diagonal and a simple product of univariate normal densities if Sigma is diagonal.

You can extract log-likelihood values of a single response variable by means of the resp argument.
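
For example, with a fitted multivariate model fit (an illustrative name), something like:

log_lik(fit)               # one joint log-likelihood value per observation (row)
log_lik(fit, resp = "B")   # log-likelihood values for response B only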