Error using kfold on stan_gamm4 object

Hi all,
I’ve been playing around with fitting models using rstanarm::stan_gamm4(). The models fit fine, but when I use loo I receive warning messages suggesting I use kfold with K=10 (see MWE below). However, kfold then fails with an error indicating that it cannot find the random-effect grouping variable (in this case fac). Am I missing something obvious here?

Package versions:
rstanarm (2.19.3)
rstantools (2.0.0)
loo (2.2.0)

MWE:

library(rstanarm)
library(mgcv)

# Simulate data with a grouping factor (fac)
set.seed(200)
dat <- gamSim(6, n=200, scale=2)

# Fit the Stan model
#options(mc.cores=1) # Note this doesn't work in RStudio 1.2.5042 with R 4.0.0
fit <- rstanarm::stan_gamm4(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, random = ~(1|fac))

# Loo
loo_fit <- loo(fit)
#Warning message:
#Found 200 observations with a pareto_k > 0.7. With this many problematic observations we
#recommend calling 'kfold' with argument 'K=10' to perform 10-fold cross-validation rather than LOO.

#Kfold
kfold(fit, K=10)
#Fitting model 1 out of 10
#Error in eval(predvars, data, env) : object 'fac' not found

Note that the same error occurs if I supply folds to kfold using one of the convenience functions such as loo::kfold_split_grouped().
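For reference, a minimal sketch of how such a fold vector can be built with the loo helpers. The grouping vector below is a hypothetical stand-in for dat$fac (assumed here to have 4 levels across 200 rows), and the final kfold call is left as a comment because it needs the fitted model from the MWE; with the package versions listed above it ran into the same error.

```r
library(loo)

# Hypothetical stand-in for dat$fac: 200 observations in 4 groups
fac <- factor(rep(1:4, times = 50))

# Keep each group together in a single fold; K should not exceed the number of groups
folds_grouped <- kfold_split_grouped(K = 4, x = fac)

# Alternatively, spread each group roughly evenly across 10 folds
folds_strat <- kfold_split_stratified(K = 10, x = fac)

# Cross-tabulate groups against fold assignments as a sanity check
table(fac, folds_grouped)

# With the fitted model from the MWE (not run here):
# kfold(fit, K = 4, folds = folds_grouped)
```

Grouped folds hold out whole levels of fac, whereas stratified folds keep every level represented in each training set, so the two answer slightly different predictive questions.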

That is probably a bug in rstanarm, but if all the observations have high pareto_k values, the model is probably overfitting anyway.

Yep, that’s just a minimal working example. In the actual models only about 3% of the observations have high pareto_k values.
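To check what share of observations is affected, the diagnostic helpers in loo can be used. This is a minimal sketch that uses the example log-likelihood array shipped with loo as a stand-in for a real fit (an assumption made purely for illustration); with the model above, the object returned by loo(fit) would take the place of loo_obj.

```r
library(loo)

# Stand-in data: example pointwise log-likelihood array bundled with loo
ll <- example_loglik_array()
r_eff <- relative_eff(exp(ll))
loo_obj <- loo(ll, r_eff = r_eff)

# Counts of observations falling in each Pareto k range
pareto_k_table(loo_obj)

# Indices of observations above the 0.7 threshold
high_k <- pareto_k_ids(loo_obj, threshold = 0.7)

# Proportion of observations with a high Pareto k
prop_high <- mean(pareto_k_values(loo_obj) > 0.7)
prop_high
```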

If you can open an issue on rstanarm’s GitHub repository, we are somewhat more likely to remember to fix it. @jonah has fixed things like this several times.

Yep I’ll do it now. Just wanted to make sure it was a bug first and not my incompetence ;P

Thanks for reporting this. I’ll take a look at the bug report and see if I can figure out what is going on.

This is fixed now on GitHub.
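For anyone landing here later, a sketch of installing the development version from GitHub to pick up the fix. It assumes the remotes package is installed and a working C++ toolchain is available; compiling rstanarm from source can take a while.

```r
# Install the development version of rstanarm from GitHub (compiles from source)
remotes::install_github("stan-dev/rstanarm", build_vignettes = FALSE)

# After restarting R, confirm which version is installed
packageVersion("rstanarm")
```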
