Hi all,
I wanted to follow up on this to check my understanding of the PSIS-LOO results I’m getting for this model. I’ve extended the model above to take a matrix of predictors X, with correlated varying effects for each species for each predictor. At a quick count, with 6 predictor variables (and no species-specific intercept) there are 209 parameters, which I understand is a lot relative to the number of data points (336).
Running model$loo() gives the following, with a good number of Pareto k values > 0.7 (the intercept-only model above gives all points > 0.7…!):
```
Computed from 4000 by 336 log-likelihood matrix

         Estimate   SE
elpd_loo   -674.7 34.5
p_loo        99.3  7.7
looic      1349.4 69.0
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)       180  53.6%   208
 (0.5, 0.7]   (ok)         103  30.7%    70
   (0.7, 1]   (bad)         47  14.0%    13
   (1, Inf)   (very bad)     6   1.8%     6
See help('pareto-k-diagnostic') for details.
```
Having done a bit of searching around Discourse, I found @avehtari’s comment below, which I think sums up what’s going on here:
Here, p_loo is about half the counted number of parameters, which is nice (and suggests some regularisation is happening), but the counted number of parameters (209) is much bigger than n/5 = 67.2, which I take to mean that the model is highly flexible.
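Spelling out the arithmetic behind that reading (just plugging in the numbers from the loo printout above; nothing else is assumed):

```python
# Sanity check of the p_loo heuristics, using numbers from the loo output.
n = 336        # observations
p = 209        # counted parameters
p_loo = 99.3   # effective number of parameters reported by loo

print(p_loo / p)   # ~0.48: p_loo is roughly half the counted parameters
print(n / 5)       # 67.2: rough flexibility threshold for p
print(p > n / 5)   # True: parameter count is well above n/5
```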
To test this, I put more strongly regularising priors on the variance components (going from exponential(1) to exponential(2)), which reduced the number of observations with Pareto k > 1 (and slightly reduced the total above 0.7), suggesting that the problems are indeed to do with model flexibility:
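For reference, this is what that prior change implies quantitatively (just basic properties of the exponential distribution, nothing model-specific):

```python
import math

# An exponential(rate) prior has mean and sd both equal to 1/rate, so
# raising the rate from 1 to 2 halves the prior mean of each group-level
# sd, and the tail mass on large sds shrinks even faster.
for rate in (1.0, 2.0):
    prior_mean = 1.0 / rate
    p_sd_above_1 = math.exp(-rate * 1.0)  # P(sd > 1) under exponential(rate)
    print(rate, prior_mean, round(p_sd_above_1, 3))
```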
```
Computed from 4000 by 336 log-likelihood matrix

         Estimate   SE
elpd_loo   -676.6 34.4
p_loo        98.7  7.7
looic      1353.1 68.9
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)       174  51.8%   212
 (0.5, 0.7]   (ok)         111  33.0%    46
   (0.7, 1]   (bad)         49  14.6%     8
   (1, Inf)   (very bad)     2   0.6%    12
See help('pareto-k-diagnostic') for details.
```
However, increasing these rate parameters further seems a poor solution in general, because 1) pushing them to very high values still doesn’t remove the high Pareto k values, and also produces large numbers of divergent transitions, and 2) highly constrained variances would clash with domain-specific knowledge, since species often differ strongly in their responses to the environment, and the model needs enough flexibility to estimate those differences accurately.
Is my understanding correct that this is a problem of model flexibility? For a too-flexible model, can the problems with PSIS-LOO be side-stepped by using other cross-validation techniques (e.g. K-fold CV), or are those equally invalid under these conditions? More generally (beyond my solution above of a permanently held-out sample), it would be good to be able to compare the predictive ability of these models across different covariates and increasing complexity (one option I’d like to explore is phylogenetic varying effects), where the data perhaps don’t so easily permit a hold-out set.
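To make the K-fold question concrete, this is the computation I have in mind, sketched in Python/NumPy with simulated log-likelihood draws standing in for the per-fold refits (in practice each `log_lik_heldout` matrix would come from refitting the Stan model without that fold; the toy sizes here are arbitrary, and in R the loo package’s kfold() helpers do this directly):

```python
import numpy as np

rng = np.random.default_rng(1)

def log_mean_exp(a, axis=0):
    """Numerically stable log of the mean of exp(a) along an axis."""
    m = a.max(axis=axis)
    return m + np.log(np.exp(a - m).mean(axis=axis))

# Toy stand-ins for the real problem (which has 336 obs and 4000 draws).
n, S, K = 30, 200, 5               # observations, posterior draws, folds
folds = rng.permutation(n) % K     # random balanced fold assignment

elpd_contributions = np.empty(n)
for k in range(K):
    held_out = np.where(folds == k)[0]
    # Real workflow: refit the model on the other K-1 folds and extract
    # pointwise log-likelihood draws for `held_out`. Here we fake that
    # (S x n_heldout) matrix with random draws purely for illustration.
    log_lik_heldout = rng.normal(-1.0, 0.5, size=(S, held_out.size))
    # Pointwise elpd: log of the posterior-mean likelihood per observation.
    elpd_contributions[held_out] = log_mean_exp(log_lik_heldout, axis=0)

elpd_kfold = elpd_contributions.sum()
se_kfold = np.sqrt(n * elpd_contributions.var())
print(elpd_kfold, se_kfold)
```

Unlike PSIS-LOO there is no importance sampling here, so no Pareto k diagnostics to fail; the cost is K full refits.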