Information criteria for multivariate models

#1

Hi!

I am actually exploring multivariate latent variable models, and I am wondering how to compute information criteria (loo, waic) for such models. I have an hard time to find references dealing with such questions, except this one, which presents criteria I never heard of.

Specifically, I can’t figure out how to write the likelihood computation in the generated quantity block, and the type and dimensions of output (array of matrices?).

I know that brms can compute such criteria, but the likelihood computation is made after model fitting, so I can’t use it to produce an exemple of correct generated quantity block.

Any help?

Thank you!
Lucas

Information criteria for multivariate response model
#2

Hi Lucas,

There exist many multivariate latent variable models, could you be more specific? Also if you would show Stan code for your model, it would be easier to say how to write the likelihood computation in the generated quantity block.

#3

Hi!

Here is the stan code for a model including covariates. We already discussed these kinds of model (close to bayesian factor analysis) here.

``````data{
int N; // sample size
int P; // number of species
int K; //Number of predictors
int D; // number of dimensions
matrix[N,K] X; //Predictor matrix
int<lower=0> Y[N,P]; // data matrix of order [N,P]
}
transformed data{
int<lower=1> M;
vector[D] FS_mu; // factor means
vector<lower=0>[D] FS_sd; // factor standard deviations
for (m in 1:D) {
FS_mu[m] = 0; //Mean of factors = 0
FS_sd[m] = 1; //Sd of factors = 1
}
}
parameters{
//Parameters
vector[N] alpha; //Row intercepts
row_vector[P] b0; // Intercept per species
matrix[K,P] b; //Coefficient per species
matrix[N,D] FS; //Factor scores, matrix of order [N,D]
cholesky_factor_corr[D] Rho; //Correlation matrix between factors
//Hyperparameters
real<lower=0> sigma_a; //Variance of the row intercepts
real<lower=0> sigma_b0; //Variance of the species intercepts
vector<lower=0>[K] sigma_b; //Variance of the species slopes
real<lower=0> eta; //Parameter of LKJ prior for Rho
}
transformed parameters{
cholesky_factor_cov[P,D] lambda; //Final matrix of laodings
matrix[D,D] Ld; // cholesky decomposition of the covariance matrix between factors
matrix[N,P] temp; //intermediate predictor
matrix<lower=0>[N,P] mu; // predicted values

// Correlation matrix of factors
Ld = diag_pre_multiply(FS_sd, Rho);

{
idx2 = 0;

for(i in 1:(D-1)) { for(j in (i+1):(D)){ lambda[i,j] = 0; } } //0 on upper diagonal
for(i in 1:D) lambda[i,i] = L_diag[i]; //Positive values on diagonal
for(j in 1:D) {
for(i in (j+1):P) {
idx2 = idx2+1;
lambda[i,j] = L_lower[idx2];
}
}
}

// Predictor
temp = FS * lambda' + X * b;
for(n in 1:N) mu[n] = exp(alpha[n] + b0 + temp[n]);

}
model{
// Hyperpriors
sigma_a ~ gamma(2,0.1); //Row intercepts hyperprior
sigma_b0 ~ gamma(2,0.1); //Species intercept effects hyperprior
for(k in 1:K) sigma_b[k] ~ gamma(2,0.1); //Species slopes hyperprior
eta ~ gamma(2,0.1); //Parameter of the cholesky prior for FS correlation structure

// Priors
alpha ~ normal(0, sigma_a); //Regularizing prior for row intercepts
b0 ~ normal(0, sigma_b0); //Regularizing prior for species intercepts
for(k in 1:K) b[k,] ~ normal(0, sigma_b[k]); //Regularizing prior for species slopes
//L_diag ~ normal(0,1); //Prior for diagonal elements
for(d in 1:D) L_diag[d] ~ chi_square(N-d);
Rho ~ lkj_corr_cholesky(eta); //Regularizing prior for Rho

for(i in 1:N){
Y[i,] ~ poisson(mu[i]);
FS[i,] ~ multi_normal_cholesky(FS_mu, Ld);
}
}
generated quantities{
matrix[P,P] cov_lambda;

cov_lambda = multiply_lower_tri_self_transpose(lambda);
}
``````

#4

Each element in `Y[i,]` is iid from `poisson(mu[i])`, so loop over `i` and `j` and compute

``````log_lik[i,j] = poisson_lpmf(Y[i,j] | mu[i]);
``````

after the loop you may also transform that to a vector with `to_vector` if you want to have it in the shape `loo` package likes.

#5

As simple as that! Thank you very much!

#6

I wrote this generated quantity block :

``````generated quantities{
matrix[P,P] cov_lambda;
matrix[N,P] log_lik1;
vector[N*P] log_lik;

cov_lambda = multiply_lower_tri_self_transpose(lambda); //Create the covariance matrix

//Compute the likelihood for each observation
for(n in 1:N){
for(p in 1:P){
log_lik1[n,p] = poisson_lpmf(Y[n,p] | mu[n,p]);
}
}
log_lik = to_vector(log_lik1); //Tranform in a vector usable by loo package
}
``````

but when I tried to compute loo :

covX_log_lik <- extract_log_lik(covX_stan, merge_chains = F)
covX_r_eff <- relative_eff(exp(covX_log_lik))
(covX_loo <- loo(covX_log_lik, r_eff = covX_r_eff))

I get this output :
`Computed from 3000 by 336 log-likelihood matrix`

``````       Estimate   SE
elpd_loo   -779.0 47.2
p_loo       284.8 27.0
looic      1558.0 94.3
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     124   36.9%   1
(0.5, 0.7]   (ok)        75   22.3%   0
(0.7, 1]   (bad)       90   26.8%   0
(1, Inf)   (very bad)  47   14.0%   0
See help('pareto-k-diagnostic') for details.
There were 29 warnings (use warnings() to see them)
warnings()
Messages d'avis :
1: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.

2: In log(z) : production de NaN
3: In log(z) : production de NaN
4: In log(z) : production de NaN
...
``````

When I did what you suggested, that is

``````generated quantities{
matrix[P,P] cov_lambda;
matrix[N,P] log_lik1;
vector[N*P] log_lik;

cov_lambda = multiply_lower_tri_self_transpose(lambda); //Create the covariance matrix

//Compute the likelihood for each observation
for(n in 1:N){
for(p in 1:P){
log_lik1[n,p] = poisson_lpmf(Y[n,p] | mu[n]);
}
}
log_lik = to_vector(log_lik1); //Tranform in a vector usable by loo package
}
``````

The loo computation was impossible, and returned NA only.

#7

Sorry, I didn’t notice that mu is a matrix since you had

``````Y[i,] ~ poisson(mu[i]);
``````

``````Y[i,] ~ poisson(mu[i,]);
If `Y` has 336 observations, and `mu` has 336 unknowns then this is possible, and implies that your prior is weak (too flexible) or data is overdispersed compared to Poisson.