I have two brms models fit to the same data and would like to compare them using LOOIC. Each has 10,000 posterior samples. When I run them through loo(), I get the following output:
Full Output
> loo(model_a, model_b)
Output of model 'model_a':
Computed from 10000 by 9941 log-likelihood matrix
          Estimate     SE
elpd_loo    -905.4   50.2
p_loo        132.7    9.5
looic       1810.9  100.4
------
Monte Carlo SE of elpd_loo is 0.2.
Pareto k diagnostic values:
                         Count  Pct.   Min. n_eff
(-Inf, 0.5]  (good)       9898  99.6%  1281
 (0.5, 0.7]  (ok)           43   0.4%   841
   (0.7, 1]  (bad)           0   0.0%   <NA>
   (1, Inf)  (very bad)      0   0.0%   <NA>
All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.
Output of model 'model_b':
Computed from 10000 by 9941 log-likelihood matrix
          Estimate     SE
elpd_loo    -911.0   50.7
p_loo        146.2   10.4
looic       1822.0  101.4
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
                         Count  Pct.   Min. n_eff
(-Inf, 0.5]  (good)       9890  99.5%  1599
 (0.5, 0.7]  (ok)           50   0.5%   512
   (0.7, 1]  (bad)           1   0.0%  8774
   (1, Inf)  (very bad)      0   0.0%   <NA>
See help('pareto-k-diagnostic') for details.
Model comparisons:
        elpd_diff se_diff
model_a       0.0     0.0
model_b      -5.6     3.2
Warning message:
Found 1 observations with a pareto_k > 0.7 in model ‘model_b’. It is recommended to set ‘moment_match = TRUE’ in order to perform moment matching for problematic observations.
So it looks like there is 1 problematic observation in model_b that would benefit from moment matching. However, I have not been able to get this to work: as shown below, it seems to require far too much memory. I'm not sure whether this is an error or whether my models have so many parameters that the calculation truly would need roughly 14 TB.
> loo(model_a, model_b, moment_match = TRUE)
Error: cannot allocate vector of size 14818.5 Gb
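For what it's worth, here is my back-of-the-envelope attempt to make sense of that number (assuming R's "Gb" here means 1024^3 bytes and that each value is an 8-byte double):

# assuming the reported "Gb" is 1024^3 bytes and each value is an 8-byte double
14818.5 * 1024^3 / 8             # ~2.0e12 values in the requested vector
(14818.5 * 1024^3 / 8) / 10000   # ~2.0e8 values per posterior draw
# That is far more than the 9941 observations, so whatever is being expanded
# seems to scale with the number of parameters too, not just the
# log-likelihood matrix itself.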
Assuming the allocation really is that large, I did some searching and found that it may be possible to use fewer samples and thereby save memory. However, I get a different error when trying this:
> loo(model_a, model_b, moment_match = TRUE, nsamples = 5000)
Error in get("storage", envir = as.environment(x)) :
object 'storage' not found
I also get the same error when trying the pointwise calculation, which the documentation mentions can alleviate memory problems.
> loo(model_a, model_b, moment_match = TRUE, pointwise = TRUE)
Error in get("storage", envir = as.environment(x)) :
object 'storage' not found
Any tips would be appreciated. Does this seem like a bug? Am I doing something wrong? Is there an alternative approach to comparing these models (e.g., I came across, but have not yet had time to read, the loo vignette "Using Leave-one-out cross-validation for large data")?
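For reference, this is roughly what I was planning to try next if moment matching can't be made to work. It's untested on my end; my understanding is that reloo = TRUE refits the model once per problematic observation (so just one extra refit here) instead of moment matching:

# untested sketch: refit for the single flagged observation instead of moment matching
loo_a <- loo(model_a)
loo_b <- loo(model_b, reloo = TRUE)  # one refit, since only one pareto_k > 0.7
loo_compare(loo_a, loo_b)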
Traceback
46: Module(module, mustStart = TRUE)
45: .getModulePointer(x)
44: new("Module", .xData = <environment>)$stan_fit4model67dc63a04f5a_9d01fda9837a44ea20e1686fadfb580e
43: new("Module", .xData = <environment>)$stan_fit4model67dc63a04f5a_9d01fda9837a44ea20e1686fadfb580e
42: eval(call("$", mod, paste("stan_fit4", model_cppname, sep = "")))
41: eval(call("$", mod, paste("stan_fit4", model_cppname, sep = "")))
40: object@mk_cppmodule(object)
39: .local(object, ...)
38: .fun(object = .x1, data = .x2, iter = .x3, seed = .x4, init = .x5,
pars = .x6, include = .x7, warmup = .x8, thin = .x9, control = .x10,
show_messages = .x11, chains = .x12, cores = .x13)
37: .fun(object = .x1, data = .x2, iter = .x3, seed = .x4, init = .x5,
pars = .x6, include = .x7, warmup = .x8, thin = .x9, control = .x10,
show_messages = .x11, chains = .x12, cores = .x13)
36: eval(expr, envir, ...)
35: eval(expr, envir, ...)
34: eval2(call, envir = args, enclos = parent.frame())
33: do_call(rstan::sampling, args)
32: .fit_model(model, ...)
31: .fun(model = .x1, sdata = .x2, algorithm = .x3, backend = .x4,
iter = .x5, warmup = .x6, thin = .x7, chains = .x8, cores = .x9,
inits = .x10, exclude = .x11, control = .x12, future = .x13,
seed = .x14, silent = .x15)
30: eval(expr, envir, ...)
29: eval(expr, envir, ...)
28: eval2(call, envir = args, enclos = parent.frame())
27: do_call(fit_model, fit_args)
26: brm(fit = x, chains = 0)
25: withCallingHandlers(expr, message = function(c) if (inherits(c,
classes)) tryInvokeRestart("muffleMessage"))
24: suppressMessages(brm(fit = x, chains = 0))
23: loo_moment_match.brmsfit(x = .x1, loo = .x2, newdata = .x3, resp = .x4,
k_threshold = .x5, check = .x6)
22: loo_moment_match(x = .x1, loo = .x2, newdata = .x3, resp = .x4,
k_threshold = .x5, check = .x6)
21: eval(expr, envir, ...)
20: eval(expr, envir, ...)
19: eval2(call, envir = args, enclos = parent.frame())
18: do_call("loo_moment_match", moment_match_args)
17: .loo(x = .x1, newdata = .x2, resp = .x3, model_name = .x4, pointwise = .x5,
k_threshold = .x6, moment_match = .x7, reloo = .x8, moment_match_args = .x9,
reloo_args = .x10)
16: eval(expr, envir, ...)
15: eval(expr, envir, ...)
14: eval2(call, envir = args, enclos = parent.frame())
13: do_call(paste0(".", criterion), args)
12: .fun(criterion = .x1, pointwise = .x2, resp = .x3, k_threshold = .x4,
moment_match = .x5, reloo = .x6, moment_match_args = .x7,
reloo_args = .x8, x = .x9, model_name = .x10, use_stored = .x11)
11: eval(expr, envir, ...)
10: eval(expr, envir, ...)
9: eval2(call, envir = args, enclos = parent.frame())
8: do_call(compute_loo, args)
7: .fun(models = .x1, criterion = .x2, pointwise = .x3, compare = .x4,
resp = .x5, k_threshold = .x6, moment_match = .x7, reloo = .x8,
moment_match_args = .x9, reloo_args = .x10)
6: eval(expr, envir, ...)
5: eval(expr, envir, ...)
4: eval2(call, envir = args, enclos = parent.frame())
3: do_call(compute_loolist, args)
2: loo.brmsfit(model_a, model_b, moment_match = TRUE)
1: loo(model_a, model_b, moment_match = TRUE)
Session Info
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] brms_2.13.5 Rcpp_1.0.5
loaded via a namespace (and not attached):
[1] Brobdingnag_1.2-6 jsonlite_1.7.0 gtools_3.8.2
[4] StanHeaders_2.21.0-5 RcppParallel_5.0.2 threejs_0.3.3
[7] shiny_1.5.0 assertthat_0.2.1 stats4_4.0.2
[10] backports_1.1.8 pillar_1.4.6 lattice_0.20-41
[13] glue_1.4.1 digest_0.6.25 promises_1.1.1
[16] colorspace_1.4-1 htmltools_0.5.0 httpuv_1.5.4
[19] Matrix_1.2-18 plyr_1.8.6 dygraphs_1.1.1.6
[22] pkgconfig_2.0.3 rstan_2.21.2 purrr_0.3.4
[25] xtable_1.8-4 mvtnorm_1.1-1 scales_1.1.1
[28] processx_3.4.3 later_1.1.0.1 tibble_3.0.3
[31] bayesplot_1.7.2 generics_0.0.2 ggplot2_3.3.2
[34] ellipsis_0.3.1 DT_0.15 withr_2.2.0
[37] shinyjs_1.1 cli_2.0.2 magrittr_1.5
[40] crayon_1.3.4 mime_0.9 ps_1.3.4
[43] fansi_0.4.1 nlme_3.1-148 xts_0.12-0
[46] pkgbuild_1.1.0 colourpicker_1.0 prettyunits_1.1.1
[49] rsconnect_0.8.16 tools_4.0.2 loo_2.3.1
[52] lifecycle_0.2.0 matrixStats_0.56.0 stringr_1.4.0
[55] V8_3.2.0 munsell_0.5.0 callr_3.4.3
[58] packrat_0.5.0 compiler_4.0.2 rlang_0.4.7
[61] grid_4.0.2 ggridges_0.5.2 rstudioapi_0.11
[64] htmlwidgets_1.5.1 crosstalk_1.1.0.1 igraph_1.2.5
[67] miniUI_0.1.1.1 base64enc_0.1-3 codetools_0.2-16
[70] gtable_0.3.0 curl_4.3 inline_0.3.16
[73] abind_1.4-5 markdown_1.1 reshape2_1.4.4
[76] R6_2.4.1 gridExtra_2.3 rstantools_2.1.1
[79] zoo_1.8-8 knitr_1.29 bridgesampling_1.0-0
[82] dplyr_1.0.1 fastmap_1.0.1 shinystan_2.5.0
[85] shinythemes_1.1.2 stringi_1.4.6 parallel_4.0.2
[88] vctrs_0.3.2 tidyselect_1.1.0 xfun_0.16
[91] coda_0.19-3