I’m trying to get my head around Stan performance on an Intel 12th Gen CPU (i7-12700H).
These CPUs have some faster Performance cores (P-cores) & some slower Efficiency cores (E-cores), & it’s up to the OS to schedule tasks on the right core type.
I’m running Linux 5.18.10-arch1-1. Since 5.18, Linux has theoretically included support for Intel’s Thread Director to help pick the right core:
For a pretty simple latent Gaussian Process model:
If I set the frequency governor to ‘performance’ (cpupower frequency-set -g performance), a latent Gaussian Process model fits in ~800 seconds.
If I set the frequency governor to ‘powersave’ the same model fits in 460 seconds…
There is no apparent thermal throttling in either scenario: reported temps are all below 60 °C, and the fans stay pretty quiet.
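One way to check what each core is actually doing under the two governors is to read the kernel's cpufreq files while the model is fitting. This is only a minimal sketch, assuming the standard cpufreq sysfs layout; the function name cpu_freq_report and its optional root argument are hypothetical, added so the function can be pointed at a different directory.

```shell
#!/bin/sh
# Print the governor and current frequency of each logical CPU.
# Assumption: the usual cpufreq layout under /sys/devices/system/cpu.
# The optional first argument is a hypothetical override of the sysfs root.
cpu_freq_report() {
    root="${1:-/sys}"
    for dir in "$root"/devices/system/cpu/cpu[0-9]*; do
        gov_file="$dir/cpufreq/scaling_governor"
        freq_file="$dir/cpufreq/scaling_cur_freq"
        # Skip entries without readable cpufreq files (e.g. offline cores).
        [ -r "$gov_file" ] || continue
        [ -r "$freq_file" ] || continue
        khz=$(cat "$freq_file")
        # scaling_cur_freq is reported in kHz; convert to MHz for readability.
        printf '%s %s %s MHz\n' "${dir##*/}" "$(cat "$gov_file")" "$((khz / 1000))"
    done
}

cpu_freq_report
```

Saved as a script and run repeatedly during a fit (for example under watch -n 1), this should show whether the cores running the chains sit at lower clocks under ‘performance’ than under ‘powersave’.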
Has anyone else come across this behaviour? It’s quite possible that setting the frequency governor to ‘performance’ means the CPU is hitting a power limit, or that too many background tasks are being pushed to the P-cores, interrupting & displacing the Stan chains. Or something…?
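The background-task idea could be tested by pinning the run to the P-cores and comparing timings. The sketch below is an assumption-heavy illustration, not a known fix: it assumes a hybrid Intel CPU where the kernel exposes the P-core list at /sys/devices/cpu_core/cpus, and that taskset from util-linux is installed. The helper names p_core_list and run_on_p_cores are hypothetical.

```shell
#!/bin/sh
# Print the logical CPU list of the P-cores (a range such as 0-11).
# Assumption: a hybrid Intel CPU where the kernel exposes the P-core PMU
# at devices/cpu_core. The optional argument is a hypothetical override
# of the sysfs root.
p_core_list() {
    root="${1:-/sys}"
    file="$root/devices/cpu_core/cpus"
    [ -r "$file" ] && cat "$file"
}

# Run a command restricted to the P-cores, or unpinned if no list is found.
run_on_p_cores() {
    if cpus=$(p_core_list); then
        taskset -c "$cpus" "$@"
    else
        echo "no P-core list found; running unpinned" >&2
        "$@"
    fi
}

# Example: replace the echo with the command that launches the Stan chains.
run_on_p_cores echo "run finished"
```

If the pinned run under ‘performance’ is no faster than the unpinned one, displacement by background tasks is less likely to be the explanation.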
I’m just wondering if others have noticed the same, & perhaps those who understand this stuff better than I do can suggest reasons why?
UPDATE: Looks like I should have been using this:
With the EPB value set to 0 the model above runs in 450 seconds.
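For anyone wanting to see what EPB (energy-performance bias) their cores are currently using, the kernel exposes it per CPU in sysfs, where 0 favours performance and 15 favours energy saving. This is a minimal sketch assuming the intel_epb sysfs interface is present; it is not necessarily the tool referred to above, and the function name epb_summary and its root argument are hypothetical.

```shell
#!/bin/sh
# Summarise the energy-performance bias (EPB) across logical CPUs.
# 0 favours performance, 15 favours energy saving.
# Assumption: the kernel exposes cpuN/power/energy_perf_bias.
# The optional argument is a hypothetical override of the sysfs root.
epb_summary() {
    root="${1:-/sys}"
    for file in "$root"/devices/system/cpu/cpu[0-9]*/power/energy_perf_bias; do
        # Unreadable or missing files are silently skipped.
        [ -r "$file" ] && cat "$file"
    done | sort -n | uniq -c | awk '{ printf "EPB %s on %s CPUs\n", $2, $1 }'
}

epb_summary
```

Writing the value needs root; cpupower set -b 0 is one way to do it, though whether that matches the approach mentioned above is an assumption.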
UPDATE2: Well… after finding & enabling the setting for ‘Intel® Turbo Boost Max Technology 3.0’ in the BIOS, the same model now runs in 338 seconds. That’s a bit of a win! There were no substantial benefits from messing with any of the cpupower settings (even EPB) when measured using Geekbench. The big win was the BIOS setting. I’ve left the rest at default.