Hi,
I’m trying to replicate the body fat example (at least the model & selection part) from the Pavone 2022 paper. I have the data file from the github repo and the following code (pretty much identical to the same repo):
library(tidyverse)
library(rstanarm)
library(projpred)
set.seed(5345334)
# get data
df <- read.table("bodyfat.txt", header = T, sep = ";")
# centre & scale all predictor vars
df[,4:19] <- scale(df[,4:19])
df <- as.data.frame(df)
# n obs
n <- nrow(df)
# height & weight colnames are currently height in inches & weight in pounds
# name SI cols as height & weight
colnames(df[c("weight_kg", "height_cm")]) <- c("weight", "height")
# select predictors
pred <- c("age", "weight", "height", "neck", "chest", "abdomen", "hip",
"thigh", "knee", "ankle", "biceps", "forearm", "wrist")
# select outcome
target <- "siri"
# create formula for model
model_formula <- paste("siri~", paste(pred, collapse = "+"))
# n predictors
p <- length(pred)
# dataframe of just outcome & predictors
df <- df[,c(target,pred)]
# set up regularised horseshoe prior
p0 <- 2 # prior guess for the number of relevant variables
tau0 <- p0/(p-p0) * 1/sqrt(n)
rhs_prior <- hs(global_scale=tau0)
# fit model
fit_model <- stan_glm(formula = model_formula, data = df,
prior = rhs_prior,
QR = TRUE,
seed = 1,
refresh = 0)
# perform the projection predictive variable selection
bcvvs <- cv_varsel(fit_model, method='forward', cv_method='LOO', nloo = n,
verbose = FALSE)
cv_varsel()
errors out with
Error in get_refmodel.stanreg(object, ...) :
inherits(formula, "formula") is not TRUE
Any help would be appreciated.
Thanks,
i
R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.8
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] projpred_2.2.2 rstanarm_2.21.3 Rcpp_1.0.9 forcats_0.5.2 stringr_1.4.1 dplyr_1.0.10 purrr_0.3.5 readr_2.1.3 tidyr_1.2.1
[10] tibble_3.1.8 ggplot2_3.4.0 tidyverse_1.3.2
loaded via a namespace (and not attached):
[1] googledrive_2.0.0 minqa_1.2.5 colorspace_2.0-3 ellipsis_0.3.2 markdown_1.4 base64enc_0.1-3
[7] fs_1.5.2 rstudioapi_0.14 rstan_2.21.7 DT_0.26 fansi_1.0.3 lubridate_1.9.0
[13] xml2_1.3.3 codetools_0.2-18 splines_4.1.1 shinythemes_1.2.0 bayesplot_1.10.0 jsonlite_1.8.3
[19] nloptr_2.0.3 broom_1.0.1 dbplyr_2.2.1 shiny_1.7.3 compiler_4.1.1 httr_1.4.4
[25] backports_1.4.1 assertthat_0.2.1 Matrix_1.5-1 fastmap_1.1.0 gargle_1.2.1 cli_3.4.1
[31] later_1.3.0 htmltools_0.5.3 prettyunits_1.1.1 tools_4.1.1 igraph_1.3.5 gtable_0.3.1
[37] glue_1.6.2 reshape2_1.4.4 cellranger_1.1.0 vctrs_0.5.1 nlme_3.1-160 crosstalk_1.2.0
[43] ps_1.7.2 lme4_1.1-31 rvest_1.0.3 timechange_0.1.1 mime_0.12 miniUI_0.1.1.1
[49] lifecycle_1.0.3 gtools_3.9.3 googlesheets4_1.0.1 MASS_7.3-58.1 zoo_1.8-11 scales_1.2.1
[55] colourpicker_1.2.0 hms_1.1.2 promises_1.2.0.1 parallel_4.1.1 inline_0.3.19 shinystan_2.6.0
[61] gamm4_0.2-6 gridExtra_2.3 loo_2.5.1 StanHeaders_2.21.0-7 stringi_1.7.8 dygraphs_1.1.1.6
[67] boot_1.3-28.1 pkgbuild_1.4.0 rlang_1.0.6 pkgconfig_2.0.3 matrixStats_0.63.0 lattice_0.20-45
[73] rstantools_2.2.0 htmlwidgets_1.5.4 processx_3.8.0 tidyselect_1.2.0 plyr_1.8.8 magrittr_2.0.3
[79] R6_2.5.1 generics_0.1.3 DBI_1.1.3 mgcv_1.8-41 pillar_1.8.1 haven_2.5.1
[85] withr_2.5.0 xts_0.12.2 survival_3.4-0 modelr_0.1.10 crayon_1.5.2 utf8_1.2.2
[91] tzdb_0.3.0 grid_4.1.1 readxl_1.4.1 callr_3.7.3 threejs_0.3.3 reprex_2.0.2
[97] digest_0.6.30 xtable_1.8-4 httpuv_1.6.6 RcppParallel_5.1.5 stats4_4.1.1 munsell_0.5.0
[103] shinyjs_2.1.0
Paper referred to:
Pavone, F., Piironen, J., Bürkner, P.-C., & Vehtari, A. (2022). Using reference models in variable selection. Computational Statistics. Using reference models in variable selection | SpringerLink
Github repo file referred to: ref-approach-paper/bodyfat_notebook.R at master · fpavone/ref-approach-paper · GitHub