Is the kfold method in brms/rstanarm similar to kfold validation in machine learning?


I have used the kfold function in brms, and I don't know whether it just splits the dataset into 5 folds and fits the model to each fold,
or whether it works like k-fold CV in machine learning, which splits the data into a training set and a test set. Thanks!

When you use the kfold method in brms or rstanarm with K=5, it splits the data into 5 sets and uses each of them as the test data once. That is, it fits the model 5 times, each time leaving out one of the 5 sets and then evaluating how well the model fit to the rest of the data predicts the left-out set.
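To make the splitting concrete, here is a minimal base-R sketch of the same idea. It is not the brms or rstanarm implementation: it uses `lm` on simulated data as a stand-in for the Bayesian model, and the names `dat`, `fold_id`, and `held_out_pred` are made up for illustration.

```r
# Minimal base-R sketch of K-fold cross-validation (not brms internals).
set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 1 + 2 * dat$x + rnorm(n)

K <- 5
# Randomly assign each row to one of K equally sized folds.
fold_id <- sample(rep(seq_len(K), length.out = n))

held_out_pred <- rep(NA_real_, n)
for (k in seq_len(K)) {
  test_rows <- which(fold_id == k)
  # Fit on the other K - 1 folds ...
  fit_k <- lm(y ~ x, data = dat[-test_rows, ])
  # ... and predict only the rows that were left out.
  held_out_pred[test_rows] <- predict(fit_k, newdata = dat[test_rows, ])
}

# Every observation now has exactly one out-of-sample prediction.
cv_rmse <- sqrt(mean((dat$y - held_out_pred)^2))
```

Each row is used as test data exactly once and as training data K - 1 times, which is the same train/test logic as k-fold CV in machine learning.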

Hope that helps!

Thank you Jonah.
Can I ask another question? If I use kfold_predict with method 'predict', does it use the held-out data, and with method 'fitted', the training data? Is that right? Thanks

I think the distinction between predicted and fitted is different than that. Let’s take linear regression for example. In that case the fitted values would be alpha + X * beta (ignoring sigma), whereas predicted would draw from normal(alpha + X * beta, sigma). In both cases kfold_predict is presumably building up a set of combined predictions/fitted values by taking predictions/fitted values from each of the K models fit. @paul.buerkner Is that right?
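A small base-R sketch of that distinction, assuming a simple linear regression fit with `lm`. It uses point estimates rather than posterior draws, so it only illustrates the idea and is not what brms computes; `mu` and `y_rep` are illustrative names.

```r
set.seed(2)
n <- 50
x <- rnorm(n)
y <- 0.5 + 1.5 * x + rnorm(n, sd = 2)
fit <- lm(y ~ x)

alpha <- coef(fit)[["(Intercept)"]]
beta  <- coef(fit)[["x"]]
sigma <- summary(fit)$sigma  # residual standard deviation

# "fitted": the expected value only, alpha + X * beta (sigma is ignored)
mu <- alpha + beta * x

# "predicted": a draw from normal(alpha + X * beta, sigma),
# so it also carries observation-level noise
y_rep <- rnorm(n, mean = mu, sd = sigma)
```

Both quantities are evaluated at the same `x` values; they differ only in whether the residual noise is included.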

But this is still being done in terms of the log-predictive density, not actual predicted values like one might do in a typical machine learning setting?

Hello, I have checked the source code of the function kfold_predict:

function (x, method = c("predict", "fitted"), resp = NULL, ...)
{
    if (!inherits(x, "kfold")) {
        stop2("'x' must be a 'kfold' object.")
    }
    if (!all(c("fits", "data") %in% names(x))) {
        stop2("Slots 'fits' and 'data' are required. ", "Please run kfold with 'save_fits = TRUE'.")
    }
    method <- get(match.arg(method), mode = "function")
    resp <- validate_resp(resp, x$fits[[1, "fit"]], multiple = FALSE)
    all_predicted <- as.character(sort(unlist(x$fits[, "predicted"])))
    npredicted <- length(all_predicted)
    nsamples <- nsamples(x$fits[[1, "fit"]])
    y <- rep(NA, npredicted)
    yrep <- matrix(NA, nrow = nsamples, ncol = npredicted)
    names(y) <- colnames(yrep) <- all_predicted
    for (k in seq_rows(x$fits)) {
        fit_k <- x$fits[[k, "fit"]]
        predicted_k <- x$fits[[k, "predicted"]]
        obs_names <- as.character(predicted_k)
        newdata <- x$data[predicted_k, , drop = FALSE]
        y[obs_names] <- get_y(fit_k, resp, newdata = newdata,
            ...)
        yrep[, obs_names] <- method(fit_k, newdata = newdata,
            resp = resp, allow_new_levels = TRUE, summary = FALSE,
            ...)
    }
    nlist(y, yrep)
}

It seems that all predictions are made on the held-out (test) observations of each fold.
So method has no bearing on which data are predicted; it only selects whether predict or fitted is called on them. Thanks
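Since the function returns a list with `y` (observed responses) and `yrep` (a draws-by-observations matrix), actual predicted values can be scored directly, for example with an RMSE. The sketch below uses a simulated stand-in named `kp` with that same shape, because running kfold itself requires fitted models; the numbers are made up.

```r
set.seed(3)
n_obs <- 20
n_draws <- 200

# Stand-in for the list(y, yrep) structure: y is a vector of observed
# values, yrep has one row per draw and one column per observation.
y <- rnorm(n_obs)
yrep <- matrix(
  rnorm(n_draws * n_obs, mean = rep(y, each = n_draws), sd = 0.5),
  nrow = n_draws, ncol = n_obs
)
kp <- list(y = y, yrep = yrep)

# RMSE of the mean of the draws against the held-out observations.
rmse <- function(y, yrep) {
  yrep_mean <- colMeans(yrep)
  sqrt(mean((yrep_mean - y)^2))
}
rmse_val <- rmse(kp$y, kp$yrep)
```

With a real kfold object (run with save_fits = TRUE), the same `rmse` helper could be applied to the output of kfold_predict.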

Log density (or square loss) is a proper scoring rule. That's a good thing if you care about probabilistic prediction.

0/1 loss is improper. What you often see in ML is systems trained on log loss (penalized MLE or MAP) then evaluated on 0/1 loss, sometimes with sweeping thresholds to give you AUC.
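A toy illustration of the practical difference, using made-up probabilities for two hypothetical forecasters. Both classify every case correctly at a 0.5 threshold, so 0/1 loss cannot tell them apart, while log loss rewards the one whose probabilities are sharper.

```r
y <- c(1, 1, 0, 0, 1)

# Two hypothetical sets of predicted probabilities for y == 1.
p_sharp <- c(0.95, 0.90, 0.05, 0.10, 0.90)
p_vague <- c(0.55, 0.60, 0.45, 0.40, 0.55)

# 0/1 loss: fraction misclassified after thresholding.
zero_one_loss <- function(y, p, threshold = 0.5) {
  mean((p > threshold) != y)
}

# Log loss: negative mean log predictive density of the observed labels.
log_loss <- function(y, p) {
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

loss01_sharp <- zero_one_loss(y, p_sharp)
loss01_vague <- zero_one_loss(y, p_vague)
ll_sharp <- log_loss(y, p_sharp)
ll_vague <- log_loss(y, p_vague)
```

Here `loss01_sharp` and `loss01_vague` are identical, whereas `ll_sharp` is smaller than `ll_vague`.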