Quantization in estimation/storage/posterior manipulation


I’m throwing this out there, but it may not make sense at all to the core devs.

Would quantization (i.e. decreased floating point precision) make sense in Stan at estimation, storage, or posterior generation time? Would it improve speed and storage requirements?

I’m afraid that lower precision may be problematic for the HMC steps, with error accumulating incrementally, but the sampler usually doesn’t take that many steps, no?

For posterior manipulation (e.g. summarisation, diagnosis, visualization, etc), especially when one has many parameters, it may bring speed improvements at virtually no cost.
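As a rough illustration of the point above, here is a small sketch (using a synthetic, hypothetical posterior, not Stan output) of downcasting double-precision draws to single precision before summarizing. Memory is halved, and for well-scaled quantities the summaries typically agree to several significant digits:

```python
import numpy as np

# Hypothetical posterior: 4 chains x 1000 draws x 500 parameters in float64
rng = np.random.default_rng(0)
draws64 = rng.normal(size=(4, 1000, 500))

# Downcast to single precision for storage/summarization only
draws32 = draws64.astype(np.float32)

print(draws64.nbytes // 1024, "KiB in float64")
print(draws32.nbytes // 1024, "KiB in float32")  # half the memory

# Compare posterior means; accumulate the float32 sum in float64
mean64 = draws64.mean(axis=(0, 1))
mean32 = draws32.mean(axis=(0, 1), dtype=np.float64)
print("max abs difference in means:", np.max(np.abs(mean64 - mean32)))
```

Whether that residual difference matters presumably depends on the scale and conditioning of the quantities being summarized, which is exactly the question being raised here.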

Where am I wrong?

I think this was more or less tested when GPU computations were introduced, and I believe the result was that double precision is needed.

I’m not sure how many decimal digits CmdStan saves by default, but moving to binary output would probably have an impact on I/O.
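To make the text-vs-binary comparison concrete, here is a sketch on synthetic draws (not actual CmdStan output; CmdStan’s exact CSV formatting may differ) comparing the size of a 6-significant-digit text CSV against raw double-precision binary:

```python
import io
import numpy as np

rng = np.random.default_rng(1)
draws = rng.normal(size=(1000, 100))  # 1000 draws x 100 parameters

# Text output with 6 significant digits, as a rough stand-in for a
# CSV-style sampler output file
text_buf = io.StringIO()
np.savetxt(text_buf, draws, fmt="%.6g", delimiter=",")
text_bytes = len(text_buf.getvalue().encode())

# Raw binary output at full double precision: 8 bytes per value
binary_bytes = len(draws.tobytes())

print(text_bytes, "bytes as text CSV")
print(binary_bytes, "bytes as raw float64")
```

Note that binary float64 is only modestly smaller than 6-digit text here; the larger I/O wins from binary output would come from avoiding parsing/formatting cost, or from dropping to float32 (4 bytes per value).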


What do you mean by “binary output”?

Btw, I also suspect there could be an impact during sampling, but I can’t see how it would greatly affect model storage and summarization. Maybe things like the loo computation could also be sped up?