Loo() error for brms model compiled on different computer

Hi,

I am trying to compare two models, both fitted on a Windows computer, with the following loo() call:

loo_percdep <- loo(perc_mod_BB, percdep_mod_BB, cores=5, reloo=TRUE)

loo() does not work on that Windows machine: I consistently receive the error message "Error in serialize(data, node$con) : error writing to connection". I have read the topics on similar issues and tried the solutions suggested there (e.g., removing -march=native), but unfortunately none of them worked for me. So I loaded the models on my Mac (running Catalina) and ran the same command there. loo() seems to work at first on the Mac, but after it reports that there are no problematic observations and that the original ‘loo’ object will be returned, I get the following error message:

Fitting model 1 out of 1 (leaving out observation 2159)
Start sampling
Error in prep_call_sampler(object) :
the compiled object from C++ code for this model is invalid, possible reasons:

  • compiled with save_dso=FALSE;
  • compiled on a different platform;
  • does not exist (created from reading csv files).

I suspect the issue is that I compiled the models on a Windows system and am now trying to run them on a Mac. I also tried the same command with save_dso=TRUE, which didn’t change anything, but I may be wrong to specify that in the loo() call rather than when fitting the models I am trying to compare. My question is: does the problem indeed lie in the fact that the models were compiled on a Windows system? And if so, what is the most logical way to work around or fix this?

Thanks so much in advance.

Here is the sessionInfo() from my Mac, where I get the error above:
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.15

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] loo_2.2.0 brms_2.11.4 Rcpp_1.0.3 rstan_2.19.2 ggplot2_3.2.1 StanHeaders_2.21.0-1

loaded via a namespace (and not attached):
[1] Brobdingnag_1.2-6 gtools_3.8.1 threejs_0.3.3 shiny_1.4.0 assertthat_0.2.1 stats4_3.6.0
[7] remotes_2.1.0 globals_0.12.5 pillar_1.4.3 backports_1.1.5 lattice_0.20-38 glue_1.3.1
[13] digest_0.6.23 promises_1.1.0 colorspace_1.4-1 htmltools_0.4.0 httpuv_1.5.2 Matrix_1.2-17
[19] plyr_1.8.5 dygraphs_1.1.1.6 pkgconfig_2.0.3 listenv_0.8.0 purrr_0.3.3 xtable_1.8-4
[25] scales_1.1.0 processx_3.4.1 later_1.0.0 tibble_2.1.3 bayesplot_1.7.1 DT_0.11
[31] withr_2.1.2 shinyjs_1.1 lazyeval_0.2.2 cli_2.0.1 magrittr_1.5 crayon_1.3.4
[37] mime_0.8 ps_1.3.0 future_1.16.0 nlme_3.1-139 xts_0.12-0 pkgbuild_1.0.6
[43] colourpicker_1.0 rsconnect_0.8.16 tools_3.6.0 prettyunits_1.1.1 matrixStats_0.55.0 stringr_1.4.0
[49] munsell_0.5.0 callr_3.4.1 compiler_3.6.0 rlang_0.4.3 grid_3.6.0 ggridges_0.5.2
[55] rstudioapi_0.10 htmlwidgets_1.5.1 crosstalk_1.0.0 igraph_1.2.4.2 miniUI_0.1.1.1 base64enc_0.1-3
[61] codetools_0.2-16 gtable_0.3.0 curl_4.3 inline_0.3.15 abind_1.4-5 markdown_1.1
[67] reshape2_1.4.3 R6_2.4.1 gridExtra_2.3 rstantools_2.0.0 zoo_1.8-7 bridgesampling_0.8-1
[73] dplyr_0.8.3 fastmap_1.0.1 rprojroot_1.3-2 shinystan_2.5.0 shinythemes_1.1.2 stringi_1.4.5
[79] parallel_3.6.0 tidyselect_0.2.5 coda_0.19-3

This is the sessionInfo() from the Windows computer, where I got the "error writing to connection" error:

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Netherlands.1252 LC_CTYPE=English_Netherlands.1252
[3] LC_MONETARY=English_Netherlands.1252 LC_NUMERIC=C
[5] LC_TIME=English_Netherlands.1252

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] loo_2.1.0 gdata_2.18.0 foreign_0.8-71 brms_2.10.0 Rcpp_1.0.2
[6] rstan_2.19.2 ggplot2_3.2.1 StanHeaders_2.19.0

loaded via a namespace (and not attached):
[1] Brobdingnag_1.2-6 pkgload_1.0.2 gtools_3.8.1 threejs_0.3.1 shiny_1.4.0
[6] assertthat_0.2.1 stats4_3.6.1 remotes_2.1.0 sessioninfo_1.1.1 pillar_1.4.2
[11] backports_1.1.5 lattice_0.20-38 glue_1.3.1 digest_0.6.22 promises_1.1.0
[16] colorspace_1.4-1 htmltools_0.4.0 httpuv_1.5.2 Matrix_1.2-17 plyr_1.8.4
[21] devtools_2.2.1 dygraphs_1.1.1.6 pkgconfig_2.0.3 purrr_0.3.3 xtable_1.8-4
[26] scales_1.0.0 processx_3.4.1 later_1.0.0 tibble_2.1.3 bayesplot_1.7.0
[31] usethis_1.5.1 ellipsis_0.3.0 DT_0.9 withr_2.1.2 shinyjs_1.0
[36] lazyeval_0.2.2 cli_1.1.0 magrittr_1.5 crayon_1.3.4 mime_0.7
[41] memoise_1.1.0 ps_1.3.0 fs_1.3.1 nlme_3.1-140 xts_0.11-2
[46] pkgbuild_1.0.6 colourpicker_1.0 rsconnect_0.8.15 tools_3.6.1 prettyunits_1.0.2
[51] matrixStats_0.55.0 stringr_1.4.0 munsell_0.5.0 callr_3.3.2 compiler_3.6.1
[56] rlang_0.4.1 grid_3.6.1 ggridges_0.5.1 rstudioapi_0.10 htmlwidgets_1.5.1
[61] crosstalk_1.0.0 igraph_1.2.4.1 miniUI_0.1.1.1 base64enc_0.1-3 testthat_2.3.0
[66] codetools_0.2-16 gtable_0.3.0 inline_0.3.15 abind_1.4-5 markdown_1.1
[71] reshape2_1.4.3 R6_2.4.0 gridExtra_2.3 rstantools_2.0.0 zoo_1.8-6
[76] bridgesampling_0.7-2 dplyr_0.8.3 fastmap_1.0.1 rprojroot_1.3-2 shinystan_2.5.0
[81] shinythemes_1.1.2 desc_1.2.0 stringi_1.4.3 tidyselect_0.2.5 coda_0.19-3

Almost no calculations involving Stan are comparable across computers, especially if they have different operating systems. And in a lot of cases, it won’t even let you try. You should calculate loo on the same computer you compiled the model on, as soon as the MCMC is finished. In principle, loo should work for a model that is reloaded in a different session on the same computer. But in this case, observation 2159 has a Pareto k that is too high, so loo attempts to refit the model (because reloo = TRUE), and the compiled model is not available on the Mac.

It might work if you rerun brms with 1 iteration and then stick the compiled Stan model into the corresponding place of your saved Stan fit.
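
A rough sketch of that idea follows. It is untested and rests on assumptions: that the brmsfit object keeps its rstan fit in the $fit element, that the compiled model lives in that fit's stanmodel slot, and that update() accepts recompile = TRUE. The helper name swap_compiled_model is made up for illustration.

```r
# Hypothetical helper (not part of brms): `fit` is a brmsfit that was loaded
# from disk and whose compiled model is invalid on the current machine.
swap_compiled_model <- function(fit) {
  if (!requireNamespace("brms", quietly = TRUE)) {
    stop("brms is required for this sketch")
  }
  # Refit as briefly as possible, forcing recompilation on this platform.
  # Assumption: update() for brmsfit objects honours `recompile = TRUE`.
  tmp <- stats::update(fit, iter = 1, chains = 1, recompile = TRUE)
  # Assumption: `$fit` is the rstan stanfit object and its `stanmodel` slot
  # holds the compiled code. Only that slot is replaced; the draws stay as is.
  fit$fit@stanmodel <- tmp$fit@stanmodel
  fit
}
```

Usage would be something like perc_mod_BB <- swap_compiled_model(perc_mod_BB) before calling loo() with reloo = TRUE. The refit draws in tmp are discarded on purpose.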

Thank you for your quick response.
I will retry on the Windows computer, but there I keep running into the serialize "error writing to connection" error, which I do not know how to work around.

Try with cores=1. It’s a known problem on Windows that parallelization tends to break, with a not very informative error message.

Will do and will report back with the result. Thank you!!

Unfortunately, I received the same error with cores=1. I copied the warnings along with the error. I also tried it without reloo=TRUE. The same errors occurred.

loo_percdep <- LOO(perc_mod_BB, percdep_mod_BB, cores=1, reloo=TRUE)
Error in serialize(data, node$con) : error writing to connection
In addition: There were 15 warnings (use warnings() to see them)
Error in serialize(data, node$con) : error writing to connection
warnings()
Warning messages:
1: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 66 (<-DESKTOP-0CRLI75:11737)
2: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 65 (<-DESKTOP-0CRLI75:11737)
3: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 64 (<-DESKTOP-0CRLI75:11737)
4: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 63 (<-DESKTOP-0CRLI75:11737)
5: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 62 (<-DESKTOP-0CRLI75:11737)
6: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 61 (<-DESKTOP-0CRLI75:11737)
7: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 60 (<-DESKTOP-0CRLI75:11737)
8: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 59 (<-DESKTOP-0CRLI75:11737)
9: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 58 (<-DESKTOP-0CRLI75:11737)
10: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 57 (<-DESKTOP-0CRLI75:11737)
11: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 56 (<-DESKTOP-0CRLI75:11737)
12: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 55 (<-DESKTOP-0CRLI75:11737)
13: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 54 (<-DESKTOP-0CRLI75:11737)
14: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 53 (<-DESKTOP-0CRLI75:11737)
15: In .Internal(get0(x, envir, mode, inherits, ifnotfound)) :
closing unused connection 52 (<-DESKTOP-0CRLI75:11737)

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Netherlands.1252 LC_CTYPE=English_Netherlands.1252
[3] LC_MONETARY=English_Netherlands.1252 LC_NUMERIC=C
[5] LC_TIME=English_Netherlands.1252

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] loo_2.2.0 gdata_2.18.0 foreign_0.8-71 brms_2.11.1
[5] Rcpp_1.0.3 rstan_2.19.2 ggplot2_3.2.1 StanHeaders_2.19.0

loaded via a namespace (and not attached):
[1] Brobdingnag_1.2-6 pkgload_1.0.2 gtools_3.8.1 threejs_0.3.1
[5] shiny_1.4.0 assertthat_0.2.1 stats4_3.6.1 remotes_2.1.0
[9] sessioninfo_1.1.1 pillar_1.4.2 backports_1.1.5 lattice_0.20-38
[13] glue_1.3.1 digest_0.6.22 checkmate_1.9.4 promises_1.1.0
[17] colorspace_1.4-1 htmltools_0.4.0 httpuv_1.5.2 Matrix_1.2-17
[21] plyr_1.8.4 devtools_2.2.1 dygraphs_1.1.1.6 pkgconfig_2.0.3
[25] purrr_0.3.3 xtable_1.8-4 scales_1.0.0 processx_3.4.1
[29] later_1.0.0 tibble_2.1.3 bayesplot_1.7.0 usethis_1.5.1
[33] ellipsis_0.3.0 DT_0.9 withr_2.1.2 shinyjs_1.0
[37] lazyeval_0.2.2 cli_1.1.0 magrittr_1.5 crayon_1.3.4
[41] mime_0.7 memoise_1.1.0 ps_1.3.0 fs_1.3.1
[45] nlme_3.1-140 xts_0.11-2 pkgbuild_1.0.6 colourpicker_1.0
[49] rsconnect_0.8.15 tools_3.6.1 prettyunits_1.0.2 matrixStats_0.55.0
[53] stringr_1.4.0 munsell_0.5.0 callr_3.3.2 compiler_3.6.1
[57] rlang_0.4.1 grid_3.6.1 ggridges_0.5.1 rstudioapi_0.10
[61] htmlwidgets_1.5.1 crosstalk_1.0.0 igraph_1.2.4.1 miniUI_0.1.1.1
[65] base64enc_0.1-3 testthat_2.3.0 gtable_0.3.0 inline_0.3.15
[69] abind_1.4-5 markdown_1.1 reshape2_1.4.3 R6_2.4.0
[73] gridExtra_2.3 rstantools_2.0.0 zoo_1.8-6 bridgesampling_0.7-2
[77] dplyr_0.8.3 fastmap_1.0.1 rprojroot_1.3-2 shinystan_2.5.0
[81] shinythemes_1.1.2 desc_1.2.0 stringi_1.4.3 tidyselect_0.2.5
[85] coda_0.19-3

My models run for a week each, which might be the problem: I cannot run the comparison loo() command right after fitting. Do you think it would work if I run loo() right away for each model separately, store the result in an object, and then compare the stored objects once the second model has finished running?

Yes, that would work.
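
A minimal sketch of that workflow, assuming the loo and brms packages are installed. The two helper functions and the file names in the usage note are made up for illustration; loo_compare() is the comparison function from the loo package.

```r
# Hypothetical helper: compute loo for one fitted model right after sampling
# and store the result on disk so it survives the R session.
compute_and_store_loo <- function(fit, path, ...) {
  loo_obj <- brms::loo(fit, ...)  # extra arguments (e.g. reloo) are passed on
  saveRDS(loo_obj, path)
  invisible(loo_obj)
}

# Hypothetical helper: later, once both loo objects exist, compare them
# without needing the compiled models.
compare_stored_loos <- function(path_a, path_b) {
  loo::loo_compare(readRDS(path_a), readRDS(path_b))
}
```

For example, one could call compute_and_store_loo(perc_mod_BB, "loo_perc.rds") when the first model finishes, do the same for percdep_mod_BB a week later, and then run compare_stored_loos("loo_perc.rds", "loo_percdep.rds"). Both loo objects must be based on the same observations for the comparison to be meaningful.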

Great, thanks a lot for your fast reply. Will give that a go!

Unfortunately, I keep getting the same error, even on the Windows machine and with cores=1. The same error also occurs with waic():

waic_perc_mod <- waic(perc_mod_BB)
Error in serialize(data, node$con) : error writing to connection

Any ideas on how to work around this? Thanks so much in advance.

So, you estimated the model on the Windows machine and immediately called waic without shutting down R and got this error?

yes

I don’t understand why that would happen. Does the error also happen with loo?

Yes, the same error happens with loo, both with cores=5 and with cores=1:

loo_percdepmod <- loo(percdep_mod_BB)
Error in serialize(data, node$con) : error writing to connection

What does traceback() say right after the error comes?

traceback()
24: serialize(data, node$con)
23: sendData.SOCKnode(con, list(type = type, data = value, tag = tag))
22: sendData(con, list(type = type, data = value, tag = tag))
21: postNode(n, "DONE")
20: stopNode(n)
19: stopCluster.default(cl)
18: parallel::stopCluster(cl)
17: relative_eff.array(x, cores = cores)
16: relative_eff.matrix(x, chain_id = chain_id)
15: loo::relative_eff(x, chain_id = chain_id)
14: r_eff_helper(exp(log_lik), fit = fit, allow_na = allow_na)
13: r_eff_log_lik(loo_args$x, fit = x)
12: .fun(criterion = .x1, pointwise = .x2, resp = .x3, k_threshold = .x4,
reloo = .x5, reloo_args = .x6, x = .x7, model_name = .x8,
use_stored = .x9) at #1
11: eval(expr, envir, ...)
10: eval(expr, envir, ...)
9: eval2(call, envir = args, enclos = parent.frame())
8: do_call(compute_loo, args)
7: .fun(models = .x1, criterion = .x2, pointwise = .x3, compare = .x4,
resp = .x5, k_threshold = .x6, reloo = .x7, reloo_args = .x8) at #1
6: eval(expr, envir, ...)
5: eval(expr, envir, ...)
4: eval2(call, envir = args, enclos = parent.frame())
3: do_call(compute_loos, args)
2: loo.brmsfit(percdep_mod_BB)
1: loo(percdep_mod_BB)

OK. That still does not make sense to me. Maybe @jonah or @avehtari knows something.

The code line doesn’t set the cores. How do you set the number of cores used?

It’s going parallel, so I assume you still have cores>1 (or there is a bug in relative_eff)

I tried to do this with and without parallel processes.
I ran the code in every combination: with and without options(mc.cores = parallel::detectCores()), and with and without specifying cores in the call itself. So I tried, for example, loo(model), loo(model, cores=1), and loo(model, cores=5).

However, you may be right in saying that options(mc.cores = parallel::detectCores()) might still have been active in the background?
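
One way to check this, sketched in base R. In the traceback, loo::relative_eff() is called without a cores argument, so it presumably falls back to the session-wide mc.cores option; that fallback is an assumption here, not something confirmed in this thread.

```r
# Inspect the session-wide default; NULL means it was never set.
getOption("mc.cores")

# Force serial execution for everything that reads this option,
# keeping the previous value so it can be restored later.
old <- options(mc.cores = 1)
stopifnot(identical(getOption("mc.cores"), 1))

# Now call loo()/waic() again in the same session, e.g.
# loo_percdepmod <- loo(percdep_mod_BB)   # model object from the thread

# Restore the previous setting afterwards, if desired.
options(old)
```

If the error disappears with mc.cores set to 1, the option was still active in the background and the cores argument in the call was not reaching the relative efficiency step.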