Error triggered by changing sample size?

I’m attempting to estimate a model in RStan on simulated data. The program is several hundred lines, so I’m omitting it for brevity. I’m able to obtain posterior draws from 10 parallel chains (500 warmup iterations, 1,000 total iterations) whenever I set the number of simulated observations to 1,000. However, when I increase the number of simulated observations to 10,000, I receive the following error:

Error in FUN(X[[i]], ...) : 
  trying to get slot "mode" from an object of a basic class ("NULL") with no slots
Calls: stan ... sampling -> sampling -> .local -> sapply -> lapply -> FUN
Execution halted

The error is triggered after 6 chains complete; the other 4 chains never appear to start. Any idea why changing the sample size would trigger this behavior?

Do you know how much RAM it consumed? Sounds like you just ran out.

Hi sakrejda, how might I check?

Anything defined in parameters, transformed parameters, or generated quantities gets stored on every output iteration.

An int is 4 bytes and a real is 8 bytes. If you have 3 parameters and 1 generated quantity that is an integer, then that’s (3 * 8 + 4) * N_post_warmup_iters * N_chains bytes you’ll need to allocate.
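For concreteness, here’s that arithmetic in R, plugging in the run settings from the original post (10 chains, 500 post-warmup iterations). The parameter counts are just the example above, not the actual model’s:

# Back-of-the-envelope storage estimate for the saved draws.
n_real        <- 3    # parameters stored as 8-byte reals
n_int         <- 1    # generated quantity stored as a 4-byte int
n_post_warmup <- 500  # 1000 total iterations - 500 warmup
n_chains      <- 10

bytes <- (n_real * 8 + n_int * 4) * n_post_warmup * n_chains
bytes / 1024^2  # in MB; tiny here, but it scales linearly with each count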

Or something like that. It also sounds like it happens halfway through something, so you could just open up your computer’s system monitor or whatever and watch the memory usage.
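If you’d rather check from inside R than from the system monitor, base R can do it; a minimal sketch:

# gc() reports R's current memory use (see the "(Mb)" columns); run it
# before and after sampling to see roughly how much the run consumed.
gc()

# object.size() shows how big a single object is; here a large vector
# stands in for a stanfit object.
x <- rnorm(1e6)
print(object.size(x), units = "MB")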

Check the ‘pars’ argument in the RStan manual (edit: of the stan function). I think it lets you save only the things you want (useful if you have a lot of stuff).
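A minimal sketch of that, with a hypothetical model file, data list, and parameter names:

library(rstan)

# Keep only the listed quantities in the output; everything else in
# parameters/transformed parameters/generated quantities is dropped.
fit <- stan(
  file    = "model.stan",        # hypothetical file name
  data    = stan_data,           # hypothetical data list
  chains  = 10,
  iter    = 1000,
  warmup  = 500,
  pars    = c("beta", "sigma"),  # hypothetical parameter names
  include = TRUE                 # TRUE keeps the listed pars; FALSE drops them
)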

RStan uses a lot of memory (presumably due to vagaries of R that we can’t fix, given how often this comes up).

You can use CmdStan to stream data out.

@bgoodri or @jonah (not @jgabry; these multiple identities on different systems are confusing): Is there a mode in R that only streams to a file rather than storing all the draws in memory? If so, would writing everything to file and then reading it back in require less overall memory? Some standing instructions on how to scale R would be great (or a pointer to them, if they already exist).

We could totally fix this if we changed to just streaming output. R has some problems, but the C++ interface is very flexible… after the next round of services changes it’ll be worth a shot.

I think the sample_file argument does this, but I agree that we could benefit from a guide for doing this.
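A sketch using the same hypothetical model as above; if I remember correctly, RStan appends the chain ID to the file name when running more than one chain:

library(rstan)

# Stream each chain's draws to CSV as sampling runs.
fit <- stan(
  file        = "model.stan",   # hypothetical file name
  data        = stan_data,      # hypothetical data list
  chains      = 4,
  sample_file = file.path(tempdir(), "draws.csv")
)

# The CSV files can later be rebuilt into a stanfit object:
csvs <- dir(tempdir(), pattern = "^draws.*\\.csv$", full.names = TRUE)
fit2 <- read_stan_csv(csvs)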

I didn’t even realize I’m jonah here and jgabry on GitHub. I should have used jgabry here too.

Absolutely.

I don’t see why it’d need to wait, other than to avoid coding things multiple times. I didn’t know there were any services changes in the works; I know there’s been a flurry of discussion, but I wasn’t aware of any concrete plans. Is there a wiki page or something with a design document?

Does it not also store the draws in memory? So the return is no longer a fit object with draws in it?

You’re right, it doesn’t have to wait. Actually, I remembered there’s the memmap package that should let us write to file transparently, if it works well.
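As a rough illustration of the streaming idea in plain base R (just a file connection, not memmap): append each draw to disk as it’s produced instead of accumulating everything in memory.

# Stream draws to a CSV file one at a time.
con <- file("draws.csv", open = "w")
writeLines("iter,theta", con)
for (i in 1:1000) {
  theta <- rnorm(1)  # stand-in for one posterior draw
  writeLines(sprintf("%d,%f", i, theta), con)
}
close(con)

# Read the draws back in only when they're actually needed:
draws <- read.csv("draws.csv")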

Just remember that as you add dependencies, maintenance costs grow at least quadratically. Here’s how I tried to explain it on Andrew’s blog:

Can’t argue there.

I might have run into the same problem. Is the work-around just to use sample_file = tempdir()? Is it really that simple?
