Warning to others: compilation errors and memory/cache corruption

I’ve been running into a situation where, after a while of adjusting and debugging my model, I make a change and RStan crashes during compilation, typically inside g++ I think. Once this occurs there is absolutely no way to compile further models until I reboot…

Well, that is, until I discovered that I could invalidate the filesystem cache:

echo 3 > /proc/sys/vm/drop_caches

which immediately fixes the problem, probably because the compiler binary is then re-read from disk.
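For anyone who wants to try the same workaround, here is a minimal sketch. Writing to drop_caches needs root, and a plain `sudo echo 3 > ...` fails because the redirect is performed by the unprivileged shell, so the sketch falls back to `sudo tee`. The function name `drop_caches` and its optional path argument are my own, added only so the function can be pointed at a scratch file for a dry run.

```shell
# Hypothetical helper: drop the page cache, dentries and inodes.
# The optional first argument (an assumption, not part of any tool)
# overrides the target file so the function can be tried safely.
drop_caches() {
    target="${1:-/proc/sys/vm/drop_caches}"

    # Flush dirty pages to disk first so that more of the cache can be freed.
    sync

    if [ -w "$target" ]; then
        # Already running with enough privileges (e.g. as root).
        echo 3 > "$target"
    else
        # The redirect must happen inside a privileged process, hence tee.
        echo 3 | sudo tee "$target" > /dev/null
    fi
}
```

Calling `drop_caches` with no argument performs the real thing. It only discards clean cached data, so it is safe to run, but the machine will be slower for a while as files get re-read from disk.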

Now, I recognize this is probably a hardware, driver, or kernel issue on my machine, but the fact is that running a big model in Stan with lots of data is a hard-core computation that stresses the reliability of the machine (heat, cache, memory, paging, whatever).

My next machine will probably have ECC RAM, and I’m going to go look for a BIOS update for my motherboard, and so forth, but just be aware that if you suddenly find yourself unable to compile your Stan model, it might be because of a similar issue rather than a bug in Stan. I don’t know how many times I thought I should report a bug, only to have a reboot mysteriously fix the problem.


Thanks for the tip. It won’t let me mark your question as its own answer, though I tried.

We usually recommend using CmdStan for large runs. It doesn’t stress memory nearly as much because draws are streamed to a file. And there are many fewer moving pieces without R and RStan and Rcpp in the picture. You can then take the results and read them back into R for analysis with only the memory overhead required to store the draws.

Thanks Bob. I’ll see about doing that. It’d be useful to have an RStan function that packaged up your data, called CmdStan with stderr streamed to a file, and returned immediately with a message like “check back later for the final results; read them in with read_stan_csv, etc.” Something like first saving your data to an input file, then calling system("cmdstan inputfiles > ./cmdstan.out &"), and then printing the message.
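A rough sketch of the launching half of that idea, at the shell level. It assumes the model has already been compiled into a CmdStan executable and that the data have already been written to a file CmdStan can read (for example with rstan::stan_rdump). The function name `run_cmdstan_bg` and all four arguments are hypothetical; only the `sample data file=... output file=...` argument form is CmdStan's own.

```shell
# Hypothetical helper: start a compiled CmdStan model in the background
# and return immediately.
#   $1  path to the compiled model executable
#   $2  data file
#   $3  CSV file that CmdStan should write the draws to
#   $4  log file that receives both stdout and stderr
run_cmdstan_bg() {
    model="$1"
    data="$2"
    out="$3"
    log="$4"

    # nohup keeps the sampler alive if the launching terminal is closed.
    nohup "$model" sample data file="$data" output file="$out" \
        > "$log" 2>&1 &

    echo "Started sampler with PID $!."
    echo "Check back later; read the results with rstan::read_stan_csv(\"$out\")."
}
```

From R this could be wrapped with system() or system2(..., wait = FALSE), which is close to what the post above describes. Because draws are streamed to the CSV file, the partial output can be inspected while the run is still going.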

I agree that for big runs, Rcpp and all the rest of it add possible bugs and stress to the whole pipeline.

:-) That was how the first R interface I wrote worked. I’m not sure we’ve really benefited from moving to an in-memory representation, because the file I/O turns out not to be a bottleneck.
