Dolphin, the GameCube and Wii emulator - Forums

Full Version: [Quick tip] 20% performance improvement for hyperthreaded dual-core CPUs (i3, etc)
I got a new i3-4160 and ran the POV-Ray benchmark to compare it with the official results.
I was somewhat disappointed:
[Image: hcQL9k4.png]
but some searching revealed that this is a known issue. Brief explanation: you need to restrict Dolphin to two logical cores (one on each of the two physical cores in your chip), rather than letting its threads stray across all four logical cores. With the affinity set correctly, the result improved by ~20%:
[Image: Z0Onzgz.png]

Under Windows, create a convenient shortcut for starting Dolphin with the correct affinity by making a .bat file in the same directory as Dolphin.exe, and putting the following in it:
Code:
@ECHO off
REM The affinity value is a hexadecimal bitmask: 5 = binary 0101 = logical cores 0 and 2.
START /AFFINITY 5 Dolphin.exe %*
Then start Dolphin in the future from this file rather than Dolphin.exe.

Notes:
* You can also disable hyperthreading entirely in your BIOS/UEFI, but I don't recommend it: although it slows down Dolphin and a handful of similar applications, hyperthreading is a huge net gain in current native games and general desktop multitasking. Best to just set the affinity.

* I haven't tested things on Linux yet, and I don't have access to OS X.

* I don't think laptop i3s are hyperthreaded, so this probably does not apply to those. If your CPU has fewer than three logical cores, this fix will make things slower instead.

* Perhaps Dolphin developers could consider implementing a fix ensuring Dolphin's threads aren't put up for scheduling on more logical threads than there are physical cores? I'm not entirely familiar with Intel's hyperthreading, but I do know that restraining the affinity results in a significant speedup on my 2C/4T CPU.
This is a bit of an odd result given that the POV-Ray benchmark only uses one core... did you make sure to check the standard deviation between runs to be sure it wasn't thermal throttling or some other factor?
This is why people recommend i5 CPUs over i7 CPUs for Dolphin: the lack of HT improves performance. AFAIK HT presents the logical cores as physical cores to the OS, so there's no way for Dolphin to know which cores are "real" and which are "fake" for the purposes of improving performance. The only way to do it is to either disable HT (at the expense of other programs that may benefit from having extra threads) or force Dolphin to use specific cores every time you run it, as you've done.
(05-12-2015, 09:09 AM)Fiora Wrote: [ -> ]This is a bit of an odd result given that the POV-Ray benchmark only uses one core... did you make sure to check the standard deviation between runs to be sure it wasn't thermal throttling or some other factor?
Although I admit I wasn't entirely scientifically rigorous, it seems consistent, and it happens in games too. I see a 14% improvement in Double Dash, for example: I can watch it hover around 320% speed when restricted to cores #0 and #2, then fall to ~280% almost immediately after returning it to all four.

Back in POV-Ray, sure enough, going from one core to two doesn't affect performance much, but going from two to four does. It seems to matter which cores I lock it to: if I lock it to #0 and #1, performance plummets to far less than with #0 only. I suppose that confirms that cores #0 and #1 are the two threads on the first physical core, and #2 and #3 are the two threads on the second... I wasn't 100% clear on that.

I will point out that I'm using Windows 10, though I doubt there are any sudden severe thread scheduling bugs in it. I'll test it in Ubuntu 15.04 soon.
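
For anyone with a different core count, here is a rough Python sketch of how a mask like the 5 in the .bat file could be derived. It assumes the numbering observed above (logical cores #0 and #1 share the first physical core, #2 and #3 the second, and so on), which is not guaranteed on every system. The function name affinity_mask and its parameters are made up for illustration; this is not anything Dolphin or Windows provides.

```python
def affinity_mask(physical_cores, threads_per_core=2):
    """Return a hex mask selecting the first logical core of each physical core.

    Assumption: sibling hyperthreads are numbered adjacently, so logical
    cores 0 and 1 share a physical core, 2 and 3 share the next, etc.
    """
    mask = 0
    for core in range(physical_cores):
        # Set the bit for the first logical core belonging to this physical core.
        mask |= 1 << (core * threads_per_core)
    return format(mask, "X")


# A 2C/4T chip such as the i3-4160:
print(affinity_mask(2))
```

The printed value is what would go after /AFFINITY. If your system numbers its logical cores differently, the mask would need to be worked out by hand instead.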
Have you checked the clock frequency during the tests? I wonder if Windows is being particularly terrible with the scheduling such that it ends up with a lower clock rate from turbo boost (because of the core switching)?
AFAIK Core i3 CPUs don't have Turbo Boost...
Quote: I don't think laptop i3s are hyperthreaded
Mobile i3s are the same as desktop i3s (2 cores, 4 threads).
All mobile i5s are dual core with HT plus Turbo Boost. Desktop i5s are either dual core with HT or quad core (depends on the model number).
Mobile i7s are either dual core with HT or quad core with HT. Desktop i7s are quad core with HT.

So it might work on both mobile i3 and mobile i5 (idk, I haven't tried it yet)
(05-12-2015, 01:23 PM)Fiora Wrote: [ -> ]Have you checked the clock frequency during the tests? I wonder if Windows is being particularly terrible with the scheduling such that it ends up with a lower clock rate from turbo boost (because of the core switching)?
Clocks are the same, no turbo on desktop i3.

(05-12-2015, 03:25 PM)admin89 Wrote: [ -> ]it might work on both mobile i3 and mobile i5
Thanks for the info there. That warrants checking by someone with those CPUs, perhaps. I'll edit the title to be a bit vaguer.
/AFFINITY 5
Doesn't that mean that Dolphin will use up to 5 "virtual" cores?
How many threads can Dolphin actually utilize?

I know I was at least never able to use 100% of 4 cores, and games were slowing down even at 50-60% CPU.

Qaazavaca Qaanic

NO!
The affinity is a hex mask, not a core count: 0x5 = 0b...101.
If the two logical cores of each physical core are numbered consecutively, then this mask assigns Dolphin to logical cores 0 and 2, i.e. one thread on each of physical cores 0 and 1.
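
To make the bitmask reading concrete, here is a small Python sketch that lists which logical cores a given hex mask selects. The helper name cores_in_mask is made up for this example; it only does the bit arithmetic and does not query or change any real process.

```python
def cores_in_mask(hex_mask):
    """Return the logical core numbers whose bits are set in a hex affinity mask."""
    value = int(hex_mask, 16)
    # Bit n of the mask corresponds to logical core n.
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


print(cores_in_mask("5"))  # the mask used with START /AFFINITY above
print(cores_in_mask("F"))  # all four logical cores on a 2C/4T chip
```

So a mask of 5 means "cores 0 and 2", not "five cores", and F would be the mask for letting Dolphin run on all four logical cores.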