Hi all,

I’m having difficulty figuring out how to get uncertainty on classification accuracy in a *brms* model. I have binary data with balanced classes, and I fit a k-fold-ed regression model to my data like so:

```
b_model <- brm(One_if_high ~ scale(uni_prob)*scale(bi_prob_smoothed),
data = infant_2a_gold,
family = bernoulli(),
cores = 4,
chains = 4,
warmup = 3000,
iter = 30000,
thin = 4,
seed = 1,
save_model = paste0("./CognitionModelFits/gold_2a.stan"),
file = paste0("./CognitionModelFits/gold_2a_fit"),
file_refit = "on_change",
control = list(max_treedepth =15), backend = "cmdstan", silent = 0
)
```

and I then use the built-in kfold tools from *brms* to get a dataframe of cross-validated predictions, like so:

```
kf_unigram_prob <- kfold(b_model, chains = 4, cores = 4, k = 10, save_fits = TRUE, seed = 1, folds = "stratified", group ="One_if_high")
kfp <- kfold_predict(kf_unigram_prob)
```

and process the binary draws like so:

```
preds <- kfp$yrep %>%
as.data.frame() %>%
rownames_to_column() %>%
rename(DrawNum = rowname) %>%
pivot_longer(cols = -DrawNum,
names_to = "ItemClassID",
values_to = "Predicted") %>%
mutate(ItemClassID = str_remove(ItemClassID,"V"))
actuals <- infant_2a_gold %>%
rownames_to_column() %>%
select(List,rowname,Arpabet,One_if_high) %>%
rename(ItemClassID = rowname,
GroundTruth = One_if_high)
acc <- left_join(preds, actuals)%>%
mutate(Correct = if_else(Predicted == GroundTruth, 1,0)) %>%
group_by(Arpabet) %>%
summarise(p_correct = sum(Correct)/length(Correct)) %>%
mutate(BinaryPred = if_else(p_correct>0.5,1,0)) %>%
ungroup() %>%
summarise(mean_of_binary_pred = mean(BinaryPred))
```

This gives me a point estimate of 0.62 for classification accuracy, which is near-identical to the frequentist model I fit for sanity-check as I was going through this.

If I try to leverage draws to get uncertainty in classification - using code below, grouping by draws and getting the overall accuracy on each draw to form a distribution, this gives a different summary statistic - 0.59 - which represents a discrepency that is (as far as I know) not the result of run-to-run simulation error. That looks like this:

```
acc <- left_join(preds, actuals)%>%
mutate(Correct = if_else(Predicted == GroundTruth, 1,0)) %>%
group_by(DrawNum) %>%
summarise(p_correct = sum(Correct)/length(Correct)) %>%
mean_hdi(p_correct)
```

What can I do to get uncertainty in the point estimate that matches the frequentist model? (I have reasons to think this is actually the right estimate, from other similar models on similar data // elsewhere in this project). Also, it’s relevant that `Arpabet`

is an item-level variable, so I can’t group by draw and it.

Thanks so much in advance!