I’m wondering how waic and loo operate when applied to multivariate models. Specifically, I’m interested in fitting and scoring Bayesian networks or path models using brms. For instance, asking whether three variables A, B, and C are better explained by a causal chain structure, e.g. A --> B --> C, or by a common-cause structure, e.g. A <-- B --> C. In the first model, the “response” variables are B and C, with A acting only as a predictor. In the second model, the response variables are A and C, with B acting only as a predictor. In brms formulas:
model1 = mvbf(B ~ A, C ~ B)
model2 = mvbf(A ~ B, C ~ B)
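Spelled out in full, I mean something like this (an untested sketch; it assumes a data frame `dat` with columns A, B, and C):

```r
library(brms)

# Causal chain A -> B -> C: responses are B and C
fit1 <- brm(mvbf(B ~ A, C ~ B), data = dat)

# Common cause A <- B -> C: responses are A and C
fit2 <- brm(mvbf(A ~ B, C ~ B), data = dat)

loo(fit1)
loo(fit2)
```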
Will I be able to meaningfully compare these models by waic or loo (setting aside concerns about inferring causal relationships)? I ask because, if I were to use the predict() function, brms would be happy to predict B from A in the first model, but it would take more work to do that in the second model. I have it in my head that waic and loo internally depend on predict(), but maybe that’s wrong.
I’ve looked at the multivariate vignette where loo is used, but there the models being compared always concern the same response variables.
Like @paul.buerkner said, you won’t be able to use LOO or WAIC to compare models with different outcomes. The reason for that is simply that both LOO and WAIC are methods for estimating predictive performance, but models with different outcomes are predicting different things.
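To spell that out with the models above: model1’s pointwise log score is log p(B[i], C[i] | A[i]), while model2’s is log p(A[i], C[i] | B[i]). These are densities over different random variables, so the resulting elpd estimates are not on a common scale, and their difference is not meaningful.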
I don’t think it will matter at all, but to avoid potential problems in my implementation (which are hopefully not there), I would just go for the same order (no matter what the order is).
I have a slightly different flavor of Derek’s question, also about using WAIC / LOOIC with multivariate responses. I’m mostly interested in the case where we have a matrix of data Y: do these metrics need to be calculated on each element Y[i,j], or is the ‘observation’ the row Y[i,]? (I’ve been doing the latter.)
For a simple example, take a regression with multiple responses that are assumed to be
Y ~ MVN(mu, Sigma).
If Sigma is diagonal, the likelihood for each data point could be included as
log_lik[i] = normal_lpdf(…)
But if off-diagonal elements of Sigma are != 0, I was under the impression that we need to calculate
log_lik[i] = multi_normal_cholesky_lpdf(…)
Is this correct, or do we need to calculate the marginal for each Y[i,j]?
I think the brms package also does the latter, but appears to do so after estimation.
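To make the per-row vs. per-element distinction concrete, here is a toy check of my own in R (using the mvtnorm package rather than brms internals):

```r
library(mvtnorm)

# Residual covariance with nonzero off-diagonal elements
Sigma <- matrix(c(1.0, 0.6,
                  0.6, 1.0), nrow = 2)
y  <- c(0.5, -0.2)  # one row Y[i, ]
mu <- c(0, 0)

# "Observation = row": joint log density of Y[i, ]
joint <- dmvnorm(y, mean = mu, sigma = Sigma, log = TRUE)

# "Observation = element": sum of the univariate marginal log densities of Y[i, j]
marginals <- sum(dnorm(y, mean = mu, sd = sqrt(diag(Sigma)), log = TRUE))

joint - marginals  # nonzero here; the two agree only when Sigma is diagonal
```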
In at least some cases, brms used to do something wrong with PSIS-LOO for a multivariate Gaussian likelihood, and it’s possible that this is not yet fixed in the latest release.
I believe brms is doing the right thing when computing the log-likelihood of multivariate models. If the purpose is to get one (joint) likelihood per observation (i.e., across response variables), brms will use the multivariate normal density if Sigma is not diagonal and a simple product of univariate normal densities if Sigma is diagonal.
You can extract log-likelihood values of a single response variable by means of the resp argument.
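For example (a sketch; assume `fit` is a multivariate brms model with response variables B and C):

```r
ll_joint <- log_lik(fit)              # draws x observations, joint across B and C
ll_B     <- log_lik(fit, resp = "B")  # contributions of response B only
loo_B    <- loo(fit, resp = "B")      # LOO restricted to response B
```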