Limits to JSON conversion for Large Data (R character strings are limited to 2^31-1 bytes)

martinmodrak · September 13, 2021, 8:42am

Hi,
this appears to be a bug/limitation of the current cmdstanr implementation, which relies on jsonlite::write_json. Could you try building a small reproducible example (e.g. by simulating a large dataset) and filing an issue on the stan-dev/cmdstanr issue tracker on GitHub?

I think the only workaround that does not require code changes to cmdstanr is for you to write the JSON file yourself, in a way that does not require constructing large strings (you can inspect the expected format by writing a smaller dataset). Then you can call the model executable directly (see e.g. the "MCMC Sampling" chapter of the CmdStan User’s Guide) and use cmdstanr::read_cmdstan_csv or cmdstanr::as_cmdstan_fit to read the results into R.
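To make the "write the JSON yourself" idea concrete, here is a rough sketch of how that could look. The helper name `write_stan_json_chunked` and its `chunk_size` argument are made up for illustration and are not part of cmdstanr. It assumes the data is a named list of finite numeric scalars, vectors and matrices, treats length-1 vectors as scalars, and writes matrices as nested row arrays; anything else (higher-dimensional arrays, NA/Inf values) is rejected rather than handled.

```r
# Hypothetical helper (not part of cmdstanr): stream a named list to a JSON
# file in small pieces, so no single R string gets near the 2^31-1 byte limit.
write_stan_json_chunked <- function(data, file, chunk_size = 100000L) {
  con <- file(file, open = "wt")
  on.exit(close(con))

  emit <- function(...) cat(..., file = con, sep = "")
  # 15 significant digits; whole numbers are printed without a decimal point.
  num <- function(x) sprintf("%.15g", as.numeric(x))

  # Write a JSON array of n items, `step` items at a time.
  # `fragment(i)` returns the JSON text for the items at indices i.
  write_array <- function(n, fragment, step) {
    emit("[")
    start <- 1
    while (start <= n) {
      end <- min(start + step - 1, n)
      if (start > 1) emit(",")
      emit(paste(fragment(start:end), collapse = ","))
      start <- end + 1
    }
    emit("]")
  }

  emit("{")
  nms <- names(data)
  for (k in seq_along(data)) {
    x <- data[[k]]
    if (!all(is.finite(x))) stop("only finite numeric values are handled: ", nms[k])
    if (k > 1) emit(",")
    emit("\"", nms[k], "\":")
    if (is.matrix(x)) {
      # Matrices become an array of rows; chunk over rows.
      step <- max(1, chunk_size %/% max(1, ncol(x)))
      row_json <- function(i) {
        apply(x[i, , drop = FALSE], 1,
              function(r) paste0("[", paste(num(r), collapse = ","), "]"))
      }
      write_array(nrow(x), row_json, step)
    } else if (is.null(dim(x)) && length(x) == 1) {
      emit(num(x))  # assumption: length-1 vectors are meant as scalars
    } else if (is.null(dim(x))) {
      write_array(length(x), function(i) num(x[i]), chunk_size)
    } else {
      stop("arrays with more than two dimensions are not handled: ", nms[k])
    }
  }
  emit("}\n")
  invisible(file)
}

# Small usage example with simulated data.
json_path <- tempfile(fileext = ".json")
write_stan_json_chunked(
  list(N = 3L, y = c(1.5, 2, 3), X = matrix(1:6, nrow = 3)),
  json_path,
  chunk_size = 2
)
```

The resulting file could then be passed to the compiled model, e.g. `./my_model sample data file=data.json output file=output.csv` (model and file names are placeholders), and the output CSV read back with `cmdstanr::as_cmdstan_fit("output.csv")`. It is worth comparing the output on a small dataset against what cmdstanr itself writes before trusting it on the full data.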

A similar problem was discussed in the topic "Brms limited memory issue while running on 15M data points" (unfortunately without a solution). The same problem was noted for jsonlite in the Stack Overflow question "R, convert large dataset into JSON" (once again without a solution).
