Thank you, everyone! With respect to CPU cache, is eDRAM exchangeable with L3? That is, does an additional 4MB of eDRAM give you pretty much the same performance improvement as 4MB of L3 cache, or are these apples and oranges?
Just to confirm what people are reporting, Level 3 cache size appears to be a significant differentiator for the speed of Stan sampling, all other things being equal. Xeon processors that are substantially “slower” by clock speed (on an EC2 instance, say) will cheerfully finish sampling for me, even with hierarchical models on large data sets, well before an i7 processor with a much higher clock speed, even when I limit the run to physical (non-logical) cores.
What makes you sure it’s the L3 cache and not the L2? Most Xeons have 1MB L2, while the i7 has 256 KB per core.
Fair point. Let’s narrow it down to “cache really matters” and let more specific benchmarks sort out the credit.
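For anyone who wants a starting point for that kind of benchmark, here is a rough sketch of a working-set sweep: chase pointers through a random single-cycle permutation and see how the time per hop changes as the array outgrows each cache level. This is not a Stan benchmark and is not from anyone's actual measurements in this thread. The function names (`random_cycle`, `chase`, `ns_per_hop`, `sweep`) are made up for illustration, and since it is pure Python, interpreter overhead adds a large constant to every hop, so a C++ version of the same idea would show the steps far more clearly.

```python
import random
import time
from array import array


def random_cycle(n, seed=0):
    """Return a permutation of range(n) that forms one single cycle
    (Sattolo's algorithm), so a pointer chase visits every element."""
    rng = random.Random(seed)
    perm = array("q", range(n))  # 8-byte integers, stored contiguously
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what guarantees a single cycle
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def chase(perm, steps, start=0):
    """Follow idx -> perm[idx] for `steps` hops. Each load depends on the
    previous one, so the hardware prefetcher cannot hide the latency."""
    idx = start
    for _ in range(steps):
        idx = perm[idx]
    return idx


def ns_per_hop(n, steps=200_000):
    """Average wall-clock nanoseconds per hop over an n-element array."""
    perm = random_cycle(n)
    t0 = time.perf_counter()
    chase(perm, steps)
    return (time.perf_counter() - t0) / steps * 1e9


def sweep(sizes_kib, steps=200_000):
    """Time the chase for each working-set size given in KiB."""
    results = []
    for kib in sizes_kib:
        n = max(1, kib * 1024 // 8)  # 8 bytes per element
        results.append((kib, ns_per_hop(n, steps)))
    return results
```

Calling something like `sweep((32, 256, 1024, 8192, 65536))` on the Xeon and on the i7 and comparing where the per-hop time jumps would give a first hint about whether the L2 or the L3 boundary is the one that matters. The largest sizes take a while to set up in pure Python.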