Problem with unserialize with reasonably large model


#1

Hi all -

I am running a large IRT model in RStan (approx. 100K parameters, with 2 million rows in the response data) and I keep getting this error, which I have not seen before while using Stan:

Error in unserialize(socklist[[n]]) : error reading from connection

The model appears to run fine, 1500 iterations, and the error seems to pop up near the end of the chains. Then I only get an empty object in R instead of the full stan object. But the console output shows that the chains all finished.

I should mention too that I have run other large IRT models (maybe not quite this large) without a problem on the same machine, so I am at a bit of a loss and not sure how to try and diagnose the issue.

I am running RStan version 2.16.2 and R version 3.4.1 on macOS Sierra. I have rebooted the computer and also tried it on a different macOS machine with the same result.

Is this some kind of limitation in RStan’s ability to read in the final posterior draws?


#2

You may just be running out of RAM. Try it with cores = 1.
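
For a rough sense of why memory could be the bottleneck, here is a back-of-the-envelope sketch in R. The parameter and iteration counts come from the first post; the number of chains and the warmup fraction are assumptions (RStan's defaults), and the estimate ignores any extra copies made when parallel workers send their chains back to the main session. The file name and data object in the commented-out call are placeholders.

```r
n_params <- 1e5              # approximate, from the first post
iter     <- 1500             # from the first post (assumed to be the `iter` argument)
warmup   <- floor(iter / 2)  # assumption: RStan's default warmup
chains   <- 4                # assumption: RStan's default number of chains

# Each saved draw of each parameter is stored as an 8-byte double.
bytes_per_chain <- n_params * (iter - warmup) * 8
gib_per_chain   <- bytes_per_chain / 1024^3
gib_total       <- gib_per_chain * chains

cat(sprintf("~%.2f GiB per chain, ~%.2f GiB for %d chains\n",
            gib_per_chain, gib_total, chains))

# Run the chains one after another in the main R process
# instead of in parallel worker processes.
options(mc.cores = 1)
# fit <- rstan::stan(file = "irt.stan", data = stan_data, iter = iter, cores = 1)
# ("irt.stan" and stan_data are placeholders, not from this thread)
```

Transformed parameters and generated quantities are stored too, so the real footprint may be larger than this estimate.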


#3

Yes, that seems to have been the issue. Thanks much for your help!


#4

Hey @bgoodri,

I just got a different CSV error:

Error in read_one_stan_csv(attr(vbres, "args")$sample_file) :
'csvfile' does not exist on the disk

I’m not sure if this is also a RAM issue, because I had been running this model without any issues. RStan version 2.17.3. The model is quite big (>100K parameters) and I am fitting it with vb.

Any ideas? Thanks for your help!


#5

It should exist. Sometimes overly aggressive servers automatically delete files that they think are not in use. But is there a file in tempdir() that has the name you specified for sample_file?
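
A quick way to check from the R console, as a minimal sketch using only base R. The helper name find_csv_files and the file name "my_draws.csv" are made up for illustration; substitute whatever you passed as sample_file.

```r
# List CSV files in a directory, newest first.
find_csv_files <- function(dir = tempdir()) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE,
                      ignore.case = TRUE)
  files[order(file.mtime(files), decreasing = TRUE)]
}

# Where this session keeps its temporary files, and any CSVs found there.
print(tempdir())
print(find_csv_files())

# Check for one specific file name (hypothetical name).
sample_name <- "my_draws.csv"
print(file.exists(file.path(tempdir(), sample_name)))
```

Note that tempdir() is specific to the running R session, so this needs to be run in the same session that fit the model.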


#6

Thanks for the tip. I looked in tempdir() but found nothing with a .csv suffix. I looked in other R tmp folders and can’t find anything there either.

It’s running on a desktop machine, not a server, so automatic cleanup seems unlikely. I do have concurrent R sessions running: could one of them have overwritten the temp file?


#7

Different R sessions should have different temporary directories. Try this:

  1. Start a new R session
  2. Look at what tempdir() is
  3. Start a Stan model that takes at least a minute to run but specify sample_file
  4. In Windows Explorer (or whichever file browser you use), check whether a CSV file is created in that temporary directory while the model is running
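
The steps above can be sketched in R roughly as follows. This is only an illustration: new_files_since is a made-up helper, and the model object, data object, and file name in the commented-out call are placeholders. Because the Stan call blocks the session while it runs, the "while it is running" check still has to happen in a file browser (or a second R session pointed at the printed path); the listing at the end only shows what is left afterwards.

```r
# Step 2: note this session's temporary directory and what is in it now.
tmp <- tempdir()
print(tmp)
before <- list.files(tmp)

# Helper: names of files that appeared in `dir` since the snapshot `before`.
new_files_since <- function(before, dir = tempdir()) {
  setdiff(list.files(dir), before)
}

# Step 3 would go here: start the Stan run with sample_file set, e.g.
# fit <- rstan::vb(model, data = stan_data,
#                  sample_file = file.path(tmp, "draws.csv"))
# (model, stan_data, and "draws.csv" are placeholders, not from this thread)

# Step 4 (after the run): list anything new in the temporary directory.
print(new_files_since(before, tmp))
```

If the CSV shows up during the run but is gone afterwards, something is deleting it; if it never appears at all, the problem is more likely in how the file path is being set.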