# Model selection with k-fold cross validation, considering Monte Carlo error

I have 24 models to fit to data, and I am doing k-fold cross-validation for model selection. I could not use LOO (looic) because the Pareto k values were bad. I will pick the model with the highest expected log pointwise predictive density (elpd) to show the model fit in my paper. I am also going to discuss models that are not significantly worse than the best model: I calculate a z-score = (elpd difference between a model and the best model) / (standard error of the difference), and if the z-score < 2, I will treat that model as equally good as the best model.
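In code, the rule looks like this; a minimal sketch, assuming `kfold_fits` is a named list of 24 proper `"loo"`/`"kfold"` objects, one per model (the list name is hypothetical):

```r
library(loo)

# `kfold_fits`: a named list of 24 k-fold CV results, one per model
# (assumed to be proper "loo"/"kfold" objects, e.g. from loo::kfold()
# or brms::kfold()).
comp <- loo_compare(kfold_fits)    # rows sorted, best model first
print(comp)                        # columns include elpd_diff, se_diff

# z-score rule: flag models within 2 SE of the best one.
# The best model's own row has elpd_diff = 0 and se_diff = 0, so skip it.
z <- abs(comp[, "elpd_diff"]) / comp[, "se_diff"]
equally_good <- rownames(comp)[-1][z[-1] < 2]
```

Note that `se_diff` is computed from the paired pointwise differences against the best model, which is the comparison the z-score rule needs.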

One problem is that, due to Monte Carlo error, the best model can change between samplings. So I decided to run k-fold CV 100 times and pick the best model.

Here are my questions:

1. If I select the best model from 100 repetitions of k-fold CV for 24 models, should I use the model with the highest elpd or the model that becomes the best model most frequently?
2. Once the best model is selected, I will identify models that are equally good as the best model. Should I use all 2400 elpds (= 24 models × 100 k-fold CV runs) to calculate the z-scores against the best model? Or can I just use 100 elpds, the top one from each of the 100 k-fold CV runs, to calculate the z-scores (again against the best model)?

Kanghcon

I just realized that I can pool the elpds from all 100 k-fold CV runs, which decreases the Monte Carlo error roughly tenfold (it scales as 1/sqrt(number of draws), and sqrt(100) = 10). Neglecting the now very small Monte Carlo error, I can take into account only the standard error due to the number of data points when calculating the z-score.
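A sketch of this pooling idea, assuming `log_lik_runs` is a list of 100 (draws x N_obs) matrices of held-out log-likelihoods, one per repeated k-fold run (the names and shapes here are illustrative, not from the post):

```r
library(matrixStats)

# Stack the draws from all 100 runs into one big matrix:
log_lik_pooled <- do.call(rbind, log_lik_runs)   # (100 * draws) x N_obs
S <- nrow(log_lik_pooled)

# Pointwise elpd: log of the posterior-mean predictive density per point,
# computed stably via log-sum-exp.
pointwise <- colLogSumExps(log_lik_pooled) - log(S)
elpd <- sum(pointwise)

# SE from variation across the N data points, which dominates once the
# Monte Carlo error has been shrunk ~sqrt(100) = 10-fold by pooling:
se_elpd <- sqrt(length(pointwise) * var(pointwise))
```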

Hi,
@avehtari is the authority on this topic, but I will try to respond as best I can and hope he corrects me if I got stuff wrong :-)

The biggest fear I have with your proposed scheme is that it only tackles the uncertainty arising from the different splits in k-fold CV, not the uncertainty arising from the data-generating process itself.

Also, if you have the computing power to run the k-fold procedure 100 times, then you might as well have the computing power to compute exact leave-one-out cross-validation (assuming your dataset has not much more than k * 100 data points), or at least to increase k noticeably. I would bet that in most cases one run with large k is better than many runs with small k (at least if predicting one future observation is the prediction task you care about).

In any case, there are methods to compute elpd and its SE from exact k-fold/LOO cross-validation, which should IMHO be superior to the approach you propose, but AFAIK they are not well documented and are a bit opaque. The good part is that the results can then be interpreted in the same way as those from `loo`, sparing you the kind of questions you are having.
The code I have used in my work is from StanCon: https://github.com/stan-dev/stancon_talks/blob/master/2017/Contributed-Talks/07_nicenboim/kfold.Rmd

There, the model computes the likelihood for the held-out data points in `generated quantities`; see their models for how they did it. I had to update the R code a little to work with modern versions of the `loo` package:

```r
# Functions extract_log_lik_K and kfold adapted from
# https://github.com/stan-dev/stancon_talks/blob/master/2017/Contributed-Talks/07_nicenboim/kfold.Rmd
library(loo)  # for extract_log_lik()

# Combine the held-out log-likelihoods from the K per-fold stanfits
# into a single matrix.
extract_log_lik_K <- function(list_of_stanfits, list_of_holdout, ...) {
  K <- length(list_of_stanfits)
  list_of_log_liks <- plyr::llply(1:K, function(k) {
    extract_log_lik(list_of_stanfits[[k]], ...)
  })
  # `log_lik_heldout` will include the log-lik of all the held-out data
  # of all the folds. We define it as a (samples x N_obs) matrix
  # (same shape as each log_lik matrix), initialized to NA.
  log_lik_heldout <- list_of_log_liks[[1]] * NA
  for (k in 1:K) {
    log_lik <- list_of_log_liks[[k]]
    samples <- dim(log_lik)[1]
    N_obs <- dim(log_lik)[2]
    # A matrix with the same dimensions as log_lik_heldout,
    # with 1 where the observation was held out in fold k:
    heldout <- matrix(rep(list_of_holdout[[k]], each = samples),
                      nrow = samples)
    # Sanity check that previously stored log-lik values are not overwritten:
    if (any(!is.na(log_lik_heldout[heldout == 1]))) {
      warning("Heldout log_lik has been overwritten!")
    }
    # Store the held-out log-lik of fold k:
    log_lik_heldout[heldout == 1] <- log_lik[heldout == 1]
  }
  return(log_lik_heldout)
}

# Compute elpd and its SE from the combined held-out log-lik matrix.
kfold <- function(log_lik_heldout) {
  library(matrixStats)
  logColMeansExp <- function(x) {
    # More numerically stable than log(colMeans(exp(x)))
    S <- nrow(x)
    colLogSumExps(x) - log(S)
  }
  # See equation (20) of @VehtariEtAl2016
  pointwise <- matrix(logColMeansExp(log_lik_heldout), ncol = 1)
  colnames(pointwise) <- "elpd"
  # See equation (21) of @VehtariEtAl2016
  elpd_kfold <- sum(pointwise)
  se_elpd_kfold <- sqrt(ncol(log_lik_heldout) * var(pointwise))

  estimates <- matrix(NA_real_, nrow = 1, ncol = 2)
  rownames(estimates) <- "elpd_loo"
  colnames(estimates) <- c("Estimate", "SE")
  estimates["elpd_loo", "Estimate"] <- elpd_kfold
  estimates["elpd_loo", "SE"] <- se_elpd_kfold

  out <- list(
    pointwise = pointwise,
    estimates = estimates
  )
  structure(out, class = "loo")
}
```
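A sketch of how these helpers might be wired up; the data frame `df`, the compiled model object `stan_mod`, and its data interface are assumptions (the Stan model is assumed to take a `holdout` indicator and to compute `log_lik` for every observation in generated quantities, as in the StanCon talk):

```r
library(rstan)
library(loo)

K <- 10                                         # K <- N would give exact LOO
N <- nrow(df)
fold <- loo::kfold_split_random(K = K, N = N)   # fold id (1..K) per data point

fits <- vector("list", K)
holdouts <- vector("list", K)
for (k in 1:K) {
  holdouts[[k]] <- as.integer(fold == k)        # 1 = held out in fold k
  stan_data <- list(N = N, y = df$y, holdout = holdouts[[k]])
  fits[[k]] <- sampling(stan_mod, data = stan_data)
}

log_lik_heldout <- extract_log_lik_K(fits, holdouts)
res <- kfold(log_lik_heldout)
res$estimates   # elpd estimate and its SE
```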

Also, brms supports it: https://rdrr.io/cran/brms/man/kfold.brmsfit.html so you might be able to steal their code.
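A minimal brms sketch (formulas and data are placeholders, not from the post):

```r
library(brms)

fit1 <- brm(y ~ x1, data = df)
fit2 <- brm(y ~ x1 + x2, data = df)

kf1 <- kfold(fit1, K = 10)   # refits the model 10 times
kf2 <- kfold(fit2, K = 10)
loo_compare(kf1, kf2)        # elpd_diff and se_diff, as with loo()
```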

Hope that helps.
