I am working on creating a reproducible example, but in the meantime thought I would ask this now in case there is a known solution. Also, this is my first forum posting, I have tried to format correctly using code chunks etc., but I apologize if it’s not correct!
My issue is that when I compare candidate models using the
loo_model_weights() function I receive inconsistent model weights.
I realized this issue while responding to some comments for a manuscript and running some alternative models using brm(). I ran the original analysis on an old computer. Before fitting the new models, I re-fit the original models on my new machine as a start. There were six total models, a “global” model and a suite of simplified models (removing predictor variables from the global model). I fit the global model, and then used the update() function to sequentially remove predictor variables.
```r
mod1 <- brm(
  data = d,
  y ~ x1 + x2 + x3 + x4 + x5 + (1 | siteID) + (1 | year),
  family = gaussian(),
  prior = c(
    prior(normal(0, 0.25), class = "b"),
    prior(normal(-1.5, 1), class = "Intercept"),
    prior(exponential(2), class = "sd"),
    prior(exponential(2), class = "sigma")
  ),
  chains = 4,
  sample_prior = FALSE,
  iter = 6000,
  cores = 4,
  control = list(adapt_delta = 0.99)
)

mod2 <- update(mod1, formula. = . ~ . - x2 - x3, cores = 4)
```
For each model, I run a loo analysis using
```r
loo_1 <- loo(mod1, reloo = TRUE, seed = TRUE, cores = 6)
loo_2 <- loo(mod2, reloo = TRUE, seed = TRUE, cores = 6)
```
seed = TRUE was a solution I found previously after the
future package was updated. That was approximately 6 months ago, and I have been unable to re-find the forum post where I got that solution. As a sidebar, it seems like this may not be needed anymore? I ran
loo() with and without it, and my main problem remains the same either way, but thought I would mention it here.
Then to compare them, I use loo_model_weights() as such:
```r
loo_model_weights(
  list(
    model1 = loo_1,
    model2 = loo_2,
    model3 = loo_3,
    model4 = loo_4,
    model5 = loo_5,
    model6 = loo_6
  )
)
```
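For reference, here is a sketch of how I have been probing where the randomness comes from (assuming the `loo_1` … `loo_6` objects above; the seed value is arbitrary). My understanding is that the default `method = "stacking"` solves an optimization over the fixed pointwise elpd values, so repeated calls on the *same* loo objects should agree, whereas `method = "pseudobma"` uses the Bayesian bootstrap and so needs a seed to be repeatable:

```r
# Sketch: isolate the source of randomness in the weights
# (assumes loo_1 ... loo_6 already exist).
loo_list <- list(
  model1 = loo_1, model2 = loo_2, model3 = loo_3,
  model4 = loo_4, model5 = loo_5, model6 = loo_6
)

# Stacking (the default) optimizes over fixed pointwise elpd values,
# so two calls on the same loo objects should return the same weights.
w_stack_a <- loo_model_weights(loo_list, method = "stacking")
w_stack_b <- loo_model_weights(loo_list, method = "stacking")

# Pseudo-BMA+ resamples via the Bayesian bootstrap, so set a seed
# immediately before the call if you want repeatable weights.
set.seed(123)
w_pbma <- loo_model_weights(loo_list, method = "pseudobma")
```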
When I reran this code on my new computer, the function returned a different set of model weights compared to what I reported in the manuscript. My previous “best” model still received a majority of the weight, but the other models now received different relative weights.
I have re-fit the models and re-run the loo analysis several times now, and every time I get a different set of relative model weights. At first I thought the “best” model at least stayed the same in every run, but after 5 or 6 runs the relative model weights seem to vary widely.
I’m not sure if this is a new issue or if this was the case when I ran them originally ~6 months ago. When I first performed the analysis, I reported the weights as returned, and did not try to re-run them, assuming they would be consistent.
I understand that re-fitting the models could produce some variance in the parameter estimates through random sampling, so a little variation in the estimated model weights could be expected.
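To try to gauge whether the run-to-run variation is within that sampling noise, I have also been looking at the elpd differences and their standard errors (a sketch, assuming the loo objects above); if the differences between the non-best models are small relative to their SEs, I gather that sizable shifts in the relative weights across refits would not be surprising:

```r
# Sketch: check whether elpd differences are small relative to their SEs
# (assumes loo_1 ... loo_6 from above).
comp <- loo_compare(loo_1, loo_2, loo_3, loo_4, loo_5, loo_6)
print(comp)  # elpd_diff and se_diff for each model vs. the best one
```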
I’m just curious whether this is a bug, a package incompatibility, or some other software problem. Perhaps my models or analysis are not specified correctly? Or perhaps I’m way off the mark and loo_model_weights() is not expected to return consistent results?
Here is my session info:
```
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
LC_MONETARY=English_United States.1252 LC_NUMERIC=C
LC_TIME=English_United States.1252

attached base packages:
stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
viridis_0.6.0     viridisLite_0.4.0 tidybayes_2.3.1   brms_2.15.0
Rcpp_1.0.6        forcats_0.5.1     stringr_1.4.0     dplyr_1.0.5
purrr_0.3.4       readr_1.4.0       tidyr_1.1.3       tibble_3.1.1
ggplot2_3.3.3     tidyverse_1.3.1

loaded via a namespace (and not attached):
minqa_1.2.4          colorspace_2.0-0     ellipsis_0.3.2
ggridges_0.5.3       rsconnect_0.8.17     markdown_1.1
base64enc_0.1-3      fs_1.5.0             rstudioapi_0.13
listenv_0.8.0        farver_2.1.0         rstan_2.26.1
svUnit_1.0.6         DT_0.18              fansi_0.4.2
mvtnorm_1.1-1        lubridate_1.7.10     xml2_1.3.2
codetools_0.2-18     bridgesampling_1.1-2 splines_4.0.3
shinythemes_1.2.0    bayesplot_1.8.0      projpred_2.0.2
jsonlite_1.7.2       nloptr_184.108.40.206 broom_0.7.6
dbplyr_2.1.1         ggdist_2.4.0         shiny_1.6.0
compiler_4.0.3       httr_1.4.2           backports_1.2.1
assertthat_0.2.1     Matrix_1.2-18        fastmap_1.1.0
cli_2.5.0            later_1.2.0          prettyunits_1.1.1
htmltools_0.5.1.1    tools_4.0.3          igraph_1.2.6
coda_0.19-4          gtable_0.3.0         glue_1.4.2
reshape2_1.4.4       V8_3.4.2             cellranger_1.1.0
vctrs_0.3.8          nlme_3.1-152         crosstalk_1.1.1
globals_0.14.0       ps_1.6.0             lme4_1.1-26
rvest_1.0.0          mime_0.10            miniUI_0.1.1.1
lifecycle_1.0.0      gtools_3.8.2         statmod_1.4.35
future_1.21.0        MASS_7.3-53.1        zoo_1.8-9
scales_1.1.1         colourpicker_1.1.0   hms_1.0.0
promises_220.127.116.11 Brobdingnag_1.2-6 parallel_4.0.3
inline_0.3.17        shinystan_2.5.0      curl_4.3.1
gamm4_0.2-6          gridExtra_2.3        StanHeaders_2.26.1
loo_2.4.1            stringi_1.5.3        dygraphs_18.104.22.168
checkmate_2.0.0      boot_1.3-28          pkgbuild_1.2.0
rlang_0.4.11         pkgconfig_2.0.3      matrixStats_0.58.0
distributional_0.2.2 lattice_0.20-44      rstantools_2.1.1
htmlwidgets_1.5.3    processx_3.5.2       tidyselect_1.1.1
parallelly_1.25.0    plyr_1.8.6           magrittr_2.0.1
R6_2.5.0             generics_0.1.0       DBI_1.1.1
pillar_1.6.0         haven_2.4.1          withr_2.4.2
mgcv_1.8-35          xts_0.12.1           abind_1.4-5
modelr_0.1.8         crayon_1.4.1         arrayhelpers_1.1-0
utf8_1.2.1           grid_4.0.3           readxl_1.3.1
callr_3.7.0          threejs_0.3.3        reprex_2.0.0
digest_0.6.27        xtable_1.8-4         httpuv_1.6.0
RcppParallel_5.1.2   stats4_4.0.3         munsell_0.5.0
shinyjs_2.0.0
```