String vs. protobuf is a false trade-off. Some things about C++ are hard, but allowing plug-ins (of the kind we currently have with writers) to handle the details of the output format is easy in C++, so why force the decision?
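To make the plug-in point concrete, here is a minimal sketch. All the names in it (`draw_writer`, `csv_writer`, `buffer_writer`, `run_fake_sampler`) are made up for illustration and are not the actual stan-dev/stan writer API. The algorithm only sees an abstract interface, and the choice between text and in-memory doubles is made by whoever constructs the writer.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical plug-in interface: algorithm code only ever talks to this.
struct draw_writer {
  virtual ~draw_writer() = default;
  virtual void write_names(const std::vector<std::string>& names) = 0;
  virtual void write_draw(const std::vector<double>& values) = 0;
};

// Text plug-in: formats names and draws as comma-separated lines.
class csv_writer : public draw_writer {
 public:
  explicit csv_writer(std::ostream& out) : out_(out) {}
  void write_names(const std::vector<std::string>& names) override {
    for (std::size_t i = 0; i < names.size(); ++i)
      out_ << (i ? "," : "") << names[i];
    out_ << '\n';
  }
  void write_draw(const std::vector<double>& values) override {
    for (std::size_t i = 0; i < values.size(); ++i)
      out_ << (i ? "," : "") << values[i];
    out_ << '\n';
  }

 private:
  std::ostream& out_;
};

// In-memory plug-in: keeps doubles as doubles, with no text round trip.
class buffer_writer : public draw_writer {
 public:
  void write_names(const std::vector<std::string>& names) override {
    names_ = names;
  }
  void write_draw(const std::vector<double>& values) override {
    draws_.insert(draws_.end(), values.begin(), values.end());
  }
  const std::vector<std::string>& names() const { return names_; }
  const std::vector<double>& draws() const { return draws_; }

 private:
  std::vector<std::string> names_;
  std::vector<double> draws_;  // flattened, one draw after another
};

// Stand-in for a sampler (assumption: real algorithms would emit far more).
// It never knows which output format it is feeding.
void run_fake_sampler(draw_writer& w) {
  w.write_names({"mu", "sigma"});
  w.write_draw({0.5, 1.25});
  w.write_draw({-2.0, 3.0});
}
```

Passing a `csv_writer` or a `buffer_writer` to `run_fake_sampler` changes the output format without touching the sampler, so the format decision stays with the interface.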
The size on disk is not the barrier to using ASCII; it’s the time it takes to load it.
For large enough output and for deeply nested arrays, rstan must use the .csv output, both because of memory limits and because of issues caused in R/Rcpp by deeply nested std::vector<std::vector<...>>. Reading .csv is a bottleneck, as you can see in the discussion I linked, and the same goes for any other text-based format (.csv is, if anything, fairly amenable to optimization).
My point was that whatever way you do it, it’s not like there’s massive complexity in our output types that interfaces have to handle. It’s the routing that’s harder to deal with.
This hasn’t been articulated outside of conversations that happened off-discourse. Your proposal sounds like you want to stringify deep in stan-dev/stan, in the same fashion that our mass matrices are currently stringified. That seems like a big cost, since it’s one of the reasons there’s no clean code for extracting the mass matrix in rstan.
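A rough sketch of that cost, assuming a diagonal mass matrix and a made-up text layout (a `#`-prefixed, comma-separated line). `parse_diag_from_text` and `metric_sink` are hypothetical names, not anything in rstan or stan-dev/stan. Once the values are stringified upstream, every interface needs its own parser like the first function; with a typed callback, the second form is all it takes.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// What an interface has to do when the matrix arrives as text.
// Assumed layout (hypothetical): "# 1.5, 0.25, 2"
std::vector<double> parse_diag_from_text(const std::string& line) {
  std::string body = line;
  const std::size_t hash = body.find('#');
  if (hash != std::string::npos) body = body.substr(hash + 1);

  std::vector<double> out;
  std::istringstream in(body);
  std::string field;
  // Split on commas; std::stod skips leading whitespace and throws
  // std::invalid_argument if a field holds no number.
  while (std::getline(in, field, ','))
    out.push_back(std::stod(field));
  return out;
}

// What an interface does when the values arrive typed: just keep them.
struct metric_sink {
  std::vector<double> diag;
  void write_metric(const std::vector<double>& d) { diag = d; }
};
```

The parser is short, but it has to track whatever text layout is chosen upstream, and the values have already been through a decimal round trip by the time it sees them.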
This discussion is going to be a lot more productive if you can be specific about what “complete decoupling” is. In your proposal I see three features: 1) text tags; 2) stringify everything early (and therefore disallow the plug-in approach to output type handling); and 3) use a static object to handle all output. So, what do you actually want these design choices to accomplish?