Looks like there’s a new JSON library, simdjson (https://github.com/simdjson/simdjson), that’s far faster than RapidJSON. It’s reportedly 8.6 times faster.
I wonder how much slower this is than using binary serialization. In switching to protobuf, httpstan saw a ~30% increase in speed over ujson. But perhaps the speedup would have been marginal had we been using simdjson. Impressive!
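One rough way to probe the text-versus-binary question is to time decoding the same array of doubles from both encodings. The sketch below is only an illustration: it uses Python's standard `json` and `struct` modules as stand-ins, not simdjson, ujson, or protobuf, and the helper names (`make_draws`, `encode_json`, and so on) are made up for this example. Its timings say nothing about simdjson itself; only the method carries over.

```python
import json
import random
import struct
import timeit


def make_draws(n, seed=0):
    """Hypothetical stand-in for posterior draws: n standard-normal doubles."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def encode_json(draws):
    # json.dumps writes floats with repr(), which round-trips doubles exactly.
    return json.dumps(draws).encode("utf-8")


def decode_json(payload):
    return json.loads(payload)


def encode_binary(draws):
    # Little-endian IEEE 754 doubles, 8 bytes each, no framing or schema.
    return struct.pack(f"<{len(draws)}d", *draws)


def decode_binary(payload):
    n = len(payload) // 8
    return list(struct.unpack(f"<{n}d", payload))


if __name__ == "__main__":
    draws = make_draws(50_000)
    as_json = encode_json(draws)
    as_binary = encode_binary(draws)

    # Time only the decoding step, since reading fits is the concern here.
    t_json = timeit.timeit(lambda: decode_json(as_json), number=3)
    t_binary = timeit.timeit(lambda: decode_binary(as_binary), number=3)

    print(f"JSON:   {len(as_json)} bytes, {t_json:.4f} s for 3 decodes")
    print(f"binary: {len(as_binary)} bytes, {t_binary:.4f} s for 3 decodes")
```

Swapping `decode_json` for a binding to a faster parser would give a fairer picture of where a SIMD-based JSON reader lands relative to a binary format.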
Related to previous discussion here: Notes on Stan Output Serialization Options (YAML, Protobuf, Avro, CBOR)
Yep, simdjson is lightning fast, and I looked at it when working on the CmdStan JSON parser change.
However, it’s currently not suitable for CmdStan because of its C++17 requirement.
That makes sense.
I probably should have used “deserialization” in the title. I was thinking about how simdjson might change some of our thinking about the suitability of JSON as a storage format for output. That is, if simdjson makes reading large fits really fast, we would have less pressure to consider formats like HDF or Apache Arrow, etc.
There’s also size. JSON files will be more than twice as big if they’re at 16-digit precision compared to binary (there’s also the