I want to calculate loo for a multinomial regression model. This particular case is a linear model in isometric logratio (ilr) coordinates, with overdispersion modelled by a multivariate normal in logratio coordinates. We have 125 observations, each of which can be described by a multinomial on 10 categories, with a total count of about 100 for each observation. Excluding the per-observation latent vectors `z`, we have 72 parameters: three 9-dimensional regression coefficient vectors (27 in total) and the 45 free elements of a 9 × 9 covariance matrix.

I’m getting log_lik values typically around -10. If we had a uniform model, the log_lik would be 100 log(1 / 10), which is about -230. So our model appears to be much better than a uniform model (and graphically, it looks like a good description of the data). But the Pareto k diagnostics are bad:

Pareto k diagnostic values:

| Range | Rating | Count | Pct |
|---|---|---|---|
| (-Inf, 0.5] | good | 0 | 0.0% |
| (0.5, 0.7] | ok | 0 | 0.0% |
| (0.7, 1] | bad | 37 | 29.6% |
| (1, Inf) | very bad | 88 | 70.4% |

Am I doing something wrong, or do I have too many parameters for loo to work well on a data set of this size? If it is the latter, there is obvious scope for constraining the covariance matrix.
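To spell out the uniform baseline quoted above, here is a rough worked calculation. It assumes exactly $n = 100$ counts and $K = 10$ categories; the evenly spread count vector $y_k = 10$ in the second part is a hypothetical illustration, not one of my observations. The figure of about -230 is only the kernel of the multinomial log-likelihood:

$$
\sum_{k=1}^{K} y_k \log\frac{1}{K} = n \log\frac{1}{10} = -100 \log 10 \approx -230.26
$$

As far as I understand, `multinomial_lpmf` also includes the multinomial coefficient, so the full uniform-model value for an observation $y$ would be

$$
\log p(y \mid \text{uniform}) = \log n! - \sum_{k=1}^{K} \log y_k! + n \log\frac{1}{K}
$$

For the hypothetical evenly spread case $y_k = 10$, this gives

$$
\log 100! - 10 \log 10! - 100 \log 10 \approx 363.74 - 151.04 - 230.26 \approx -17.6
$$

so the comparison with the `log_lik` values of around -10 may be less dramatic than the -230 figure suggests. The coefficient is largest when the counts are evenly spread, so for uneven counts the uniform-model value would be lower than -17.6.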

```
functions {
  /** Inverse ilr transformation of a vector x, using the inverse of the
   *  transpose of the V matrix of the ilr (tVinv).
   */
  vector ilrinv(matrix tVinv, vector x, int ntaxa) {
    vector[ntaxa] z;
    vector[ntaxa] y;
    z = exp(tVinv * x);
    y = z / sum(z); // normalise so the relative abundances sum to 1
    return y;
  }
}
data {
  int<lower = 0> ntaxa; // number of taxa
  int<lower = 0> nstills; // number of stills
  matrix[ntaxa, ntaxa - 1] tVinv; // back-transformation matrix for ilr transformation
  int counts[nstills, ntaxa]; // observed counts
  vector[nstills] depth; // depth in metres
  vector[nstills] squareddepth; // squared depth
}
transformed data {
  int<lower = 1> s; // dimension of the ilr coordinates
  s = ntaxa - 1;
}
parameters {
  vector[s] beta0; // intercept
  vector[s] beta1; // depth effect
  vector[s] beta2; // squared-depth effect
  vector[s] z[nstills]; // standard-normal deviates for the overdispersion term
  cholesky_factor_corr[s] LOmega; // Cholesky factor of the correlation matrix
  vector<lower=0>[s] tau; // scales of the overdispersion covariance
}
transformed parameters {
  cholesky_factor_cov[s] LSigma;
  vector[s] x[nstills]; // predicted logratio coordinates
  vector[ntaxa] rho[nstills]; // predicted relative abundances
  LSigma = diag_pre_multiply(tau, LOmega);
  for (i in 1:nstills) {
    x[i] = beta0 + beta1 * depth[i] + beta2 * squareddepth[i] + LSigma * z[i];
    rho[i] = ilrinv(tVinv, x[i], ntaxa);
  }
}
model {
  for (i in 1:nstills) {
    counts[i] ~ multinomial(rho[i]); // observation model
    z[i] ~ normal(0, 1);
  }
  tau ~ cauchy(0, 2.5);
  LOmega ~ lkj_corr_cholesky(2);
  beta0 ~ cauchy(0, 2.5);
  beta1 ~ cauchy(0, 2.5);
  beta2 ~ cauchy(0, 2.5);
}
generated quantities {
  vector[nstills] log_lik;
  for (j in 1:nstills) {
    // pointwise log-likelihood for loo/WAIC, conditional on the latent z[j]
    log_lik[j] = multinomial_lpmf(counts[j] | rho[j]);
  }
}
```