Problem with unserialize with reasonably large model

Hi all -

I am running a large IRT model in RStan (approx 100K parameters with 2m rows in the response) and I keep getting this error that I have not received before while using Stan:

Error in unserialize(socklist[[n]]) : error reading from connection

The model appears to run fine, 1500 iterations, and the error seems to pop up near the end of the chains. Then I only get an empty object in R instead of the full stan object. But the console output shows that the chains all finished.

I should mention too that I have run other large IRT models (maybe not quite this large) without a problem on the same machine, so I am at a bit of a loss and not sure how to try and diagnose the issue.

I am running RStan version 2.16.2 and R version 3.4.1 on macOS Sierra. I have rebooted the computer and also tried it on a different Mac with the same result.

Is this some kind of limitation in Rstan’s ability to read in the final posterior draws?

You may just be running out of RAM. Try it with cores = 1.
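
For a rough sense of scale, here is a back-of-the-envelope estimate of how much memory the saved draws alone could need. The chain count (4) and the idea that all 1500 iterations per chain are stored are assumptions based on RStan defaults, not details from the original post, and the variable names are only illustrative.

```r
# Rough memory estimate for the saved draws (a sketch, not a measurement).
n_params   <- 1e5    # approx. parameter count from the post
n_iter     <- 1500   # iterations per chain from the post (ASSUMPTION: all are stored)
n_chains   <- 4      # ASSUMPTION: RStan's default number of chains
bytes_each <- 8      # one double-precision value

bytes_per_chain <- n_params * n_iter * bytes_each
gb_per_chain    <- bytes_per_chain / 1e9
gb_total        <- gb_per_chain * n_chains

cat(sprintf("per chain: %.1f GB, all chains: %.1f GB\n", gb_per_chain, gb_total))
```

With parallel chains, each worker process likely holds its own draws and then has to send them back to the main R session, so peak usage can be well above this estimate. Running with cores = 1 avoids that transfer.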


Yes that seems to have been the issue. Thanks much for your help!

Hey @bgoodri,

I just got a different CSV error:

Error in read_one_stan_csv(attr(vbres, "args")$sample_file) :
‘csvfile’ does not exist on the disk

I’m not sure if this is also a RAM issue because I had been running this model without any issues. Rstan version 2.17.3. The model is quite big (>100K parameters) and I am fitting it with vb.

Any ideas? Thanks for your help!

It should exist. Sometimes overly aggressive servers automatically delete files that they think are not in use. But is there a file in tempdir() that has the name you specified for sample_file?

Thanks for the tip, looked in tempdir() but nothing with a .csv suffix. I looked at other R tmp folders and can’t find anything else there either.

It’s running on a desktop machine, so it shouldn’t have been a server. I do have concurrent R sessions running: could one of them have written over the temp file?

Different R sessions should have different temporary directories. Try this:

  1. Start a new R session
  2. Look at what tempdir() is
  3. Start a Stan model that takes at least a minute to run but specify sample_file
  4. In Windows Explorer or whatever, look if a CSV file is created in that temporary directory while it is running
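
Steps 2 and 4 can also be done from the R console rather than a file browser. A minimal sketch using only base R; the helper name list_temp_csv is made up for illustration:

```r
# List CSV files in a directory (by default this session's temporary directory).
list_temp_csv <- function(dir = tempdir()) {
  list.files(dir, pattern = "\\.csv$", full.names = TRUE)
}

print(tempdir())        # step 2: where this session writes temporary files
print(list_temp_csv())  # step 4: any CSV files present right now
```

Since the session running Stan is busy while the model fits, call list_temp_csv() from a second R session and pass it the directory printed in the first one.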

I’m having a similar problem in rstan and wondering if this was resolved. I’m running a large model on a large dataset with vb, and I get this error after the model has converged and it draws a sample of size 1000 from the approximate posterior:

Error in scan(csvfile, what = double(), sep = ",", comment.char = "", :
too many items
Calls: vb -> vb -> .local -> read_one_stan_csv -> scan

The call I am using to run stan is

vb(m_init, data = stan_d, pars = pars,
   init = 0, tol_rel_obj = 0.007,
   adapt_engaged = FALSE, iter = 100000,
   eta = 0.1)

My session info is below. Thanks in advance,
Mikey

R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.4 (Maipo)
Matrix products: default
BLAS: /pfs/tsfs1/apps/el7-x86_64/u/gcc/7.3.0/r/3.5.3-3m5f3ae/rlib/R/lib/libRblas.so
LAPACK: /pfs/tsfs1/apps/el7-x86_64/u/gcc/7.3.0/r/3.5.3-3m5f3ae/rlib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rstan_2.18.2 StanHeaders_2.18.1 ggplot2_3.1.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 magrittr_1.5 tidyselect_0.2.5 munsell_0.5.0
[5] colorspace_1.4-0 R6_2.4.0 rlang_0.3.1 plyr_1.8.4
[9] dplyr_0.8.0.1 parallel_3.5.3 pkgbuild_1.0.2 grid_3.5.3
[13] gtable_0.2.0 loo_2.0.0 cli_1.0.1 withr_2.1.2
[17] matrixStats_0.54.0 lazyeval_0.2.1 assertthat_0.2.0 tibble_2.0.1
[21] crayon_1.3.4 processx_3.2.1 gridExtra_2.3 purrr_0.3.0
[25] callr_3.1.1 ps_1.3.0 inline_0.3.15 glue_1.3.0
[29] compiler_3.5.3 pillar_1.3.1 prettyunits_1.0.2 scales_1.0.0
[33] stats4_3.5.3 pkgconfig_2.0.2

The output is too big to be read back in from the CSV file.

Thank you @bgoodri. Is there a solution to this beyond reducing my dataset size or reducing the size of the model?

You could reduce the number of unknowns written to the CSV file by putting more of what you don’t need in the model block rather than transformed parameters.

Thank you for the tip @bgoodri! I’ll give this a try.

Moving parameters to the model block helped. Another simple solution is to draw a smaller sample from the approximate posterior by setting output_samples = 100 instead of the default, which is 1000. This gives far fewer draws, but it is still useful for what I’m doing, and the file containing the posterior draws is 10% of the size.
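
To illustrate why the number of draws matters: the CSV holds roughly one value per draw per saved quantity, so the amount scan has to read grows linearly with the number of draws. A rough sketch; the column count is an assumed round number rather than the actual model's, and values_in_csv is a made-up helper:

```r
# Number of values in the output CSV scales with draws x saved quantities.
n_columns <- 1e5   # ASSUMPTION: illustrative number of saved quantities

values_in_csv <- function(output_samples, n_columns) {
  output_samples * n_columns
}

default_size <- values_in_csv(1000, n_columns)  # 1000 draws, the default per this thread
reduced_size <- values_in_csv(100, n_columns)   # 100 draws

cat("ratio:", reduced_size / default_size, "\n")
```

The same reasoning applies to the earlier tip: every quantity kept out of the output shrinks the column count, and every draw dropped shrinks the row count.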