@Matthijs and I were just talking about Edward’s performance numbers in the Deep Probabilistic Programming paper and I was wondering if we have some response to them. Do we agree that Edward is faster when both are run on a single CPU? I know there are some issues with the comparison that have been raised, but I haven’t seen them published anywhere:
- NUTS is not implemented in Edward, though apparently Matt Hoffman is working on that now. From what I understand, that means Edward has pretty much no practical statistical-modeling usefulness until it’s released.
- Edward is single-precision (because TensorFlow is single-precision), which could affect HMC’s stability and usefulness (but by how much?).
- Stan now has specialized GLM functions that provide speedups likely bringing it within range of (or past) Edward for these kinds of models. In some sense that’s not a fair fix, but the paper arguably chose a single Edward-friendly benchmark in the first place, so our coding to the benchmark to catch up seems reasonable in that light. It would be interesting to see a model for which neither implementation has been specialized.
- Obviously the GPU and MPI work coming down the line will let us compete with their multi-CPU and GPU benchmarks, but for now I’m just asking about the single-CPU case.
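On the single-precision point: here’s a rough sketch (my own illustration, not from the paper) of why it could matter. Float32 resolves roughly 7 significant digits, so a change in a log density that is small relative to its magnitude simply vanishes, which is the kind of thing HMC’s accept/reject step and leapfrog integrator are sensitive to. The 1e4 log-posterior magnitude below is just an assumed, plausible number for a mid-sized model.

```python
import numpy as np

# Machine epsilon: the smallest relative step each precision resolves.
print(np.finfo(np.float32).eps)  # ~1.19e-07
print(np.finfo(np.float64).eps)  # ~2.22e-16

# Assume a log posterior around 1e4 in magnitude. Near 1e4 the spacing
# between representable float32 values is about 1e-3, so a smaller
# change in the log density is invisible in single precision:
lp32 = np.float32(1e4)
assert lp32 + np.float32(1e-4) == lp32              # change lost in float32
assert np.float64(1e4) + 1e-4 != np.float64(1e4)    # resolved in float64
```

Whether this actually destabilizes Edward’s HMC in practice would still need measuring, but it shows why the question is worth asking.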
Anyone have further thoughts here? Have we done our own benchmarks?