[Quick tip] 20% performance improvement for hyperthreaded dual-core CPUs (i3, etc)
05-12-2015, 06:35 AM (This post was last modified: 05-13-2015, 08:50 AM by IVCrumpet.)
#1
IVCrumpet Offline
Junior Member
**
Posts: 40
Threads: 2
Joined: Apr 2015
I got a new i3-4160 and ran the POV-Ray benchmark to compare it with the official results.
I was somewhat disappointed:
[Image: hcQL9k4.png]
but some searching revealed that this is a known issue. In brief: you need to restrict Dolphin to two logical cores (one on each of the chip's two physical cores) rather than letting its threads stray across all four logical cores. With the affinity set correctly, the result improved by ~20%:
[Image: Z0Onzgz.png]

Under Windows, create a convenient shortcut for starting Dolphin with the correct affinity by making a .bat file in the same directory as Dolphin.exe, and putting the following in it:
Code:
@ECHO off
REM The affinity mask is hexadecimal: 5 = binary 0101 = logical cores 0 and 2.
START /AFFINITY 5 Dolphin.exe %*
Then start Dolphin in the future from this file rather than Dolphin.exe.

Notes:
* You can also disable hyperthreading entirely in your BIOS/UEFI, but I don't recommend it. Although it slows down Dolphin and a handful of similar applications, hyperthreading is a huge net gain in current native games and general desktop multitasking. Best to just set the affinity.

* I haven't tested things on Linux yet, and I don't have access to OS X.

* I don't think laptop i3s are hyperthreaded, so this probably does not apply to those. If your CPU has fewer than three logical cores, this fix will make things slower instead.

* Perhaps Dolphin developers could consider implementing a fix ensuring Dolphin's threads aren't put up for scheduling on more logical threads than there are physical cores? I'm not entirely familiar with Intel's hyperthreading, but I do know that restricting the affinity results in a significant speedup on my 2C/4T CPU.
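
For the untested Linux case mentioned in the notes, the same idea (one logical CPU per physical core) could look roughly like the sketch below. It is an illustration only, not something verified in this thread: it assumes the kernel exposes `thread_siblings_list` under sysfs, and the helper names (`parse_cpu_list`, `one_cpu_per_core`, `pin_to_physical_cores`) are made up for this example.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,2" or "0-1,4" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU from every group of hyperthread siblings."""
    chosen = set()
    for text in sibling_lists:
        cpus = parse_cpu_list(text)
        if cpus:
            chosen.add(cpus[0])
    return sorted(chosen)


def read_sibling_lists():
    """Read each logical CPU's sibling list from Linux sysfs (assumed layout)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


def pin_to_physical_cores(pid=0):
    """Restrict a process (0 = the caller) to one logical CPU per physical core. Linux only."""
    chosen = one_cpu_per_core(read_sibling_lists())
    if chosen:
        os.sched_setaffinity(pid, chosen)
    return chosen
```

Calling `pin_to_physical_cores()` in a small launcher before starting the emulator would have the child process inherit the affinity. Whether this gives the same speedup as on Windows is an open question here.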
05-12-2015, 09:09 AM
#2
Fiora Offline
x86 JIT Princess
**********
Developers (Some Administrators and Super Moderators)
Posts: 237
Threads: 0
Joined: Aug 2014
This is a bit of an odd result given that the POV-Ray benchmark only uses one core... did you make sure to check the standard deviation between runs, to be sure it wasn't thermal throttling or some other factor?
05-12-2015, 09:12 AM
#3
Aleron Ives Offline
Senior Member
****
Posts: 662
Threads: 7
Joined: Apr 2014
This is why people recommend i5 CPUs over i7 CPUs for Dolphin: the lack of HT improves performance. AFAIK HT presents the logical cores as physical cores to the OS, so there's no way for Dolphin to know which cores are "real" and which are "fake" for the purposes of improving performance. The only way to do it is to either disable HT (at the expense of other programs that may benefit from having extra threads) or force Dolphin to use specific cores every time you run it, as you've done.
05-12-2015, 09:35 AM
#4
IVCrumpet Offline
Junior Member
**
Posts: 40
Threads: 2
Joined: Apr 2015
(05-12-2015, 09:09 AM)Fiora Wrote: This is a bit of an odd result given that the POV-Ray benchmark only uses one core... did you make sure to check the standard deviation between runs, to be sure it wasn't thermal throttling or some other factor?
Although I admit I wasn't entirely scientifically rigorous, it seems consistent, and it happens in games too. There's a 14% improvement in Double Dash, for example: I can watch it hover around 320% speed when restricted to cores #0 and #2, then fall to ~280% almost immediately after returning it to all four.
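
As a quick check of the Double Dash figure, the 14% follows from the two approximate speed readings quoted above. The function name `relative_gain` is just an illustrative helper:

```python
def relative_gain(pinned_speed, unpinned_speed):
    """Return the fractional speedup of the pinned run over the unpinned run."""
    return pinned_speed / unpinned_speed - 1.0


if __name__ == "__main__":
    # Approximate readings from the post: ~320% pinned, ~280% on all four logical cores.
    print(f"{relative_gain(320, 280):.1%}")  # 14.3%
```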

Back in POV-Ray, sure enough, going from one core to two doesn't affect performance much, but going from two to four does. It also seems to matter which cores I lock it to: if I lock it to #0 and #1, performance plummets to far less than with #0 only. I suppose that confirms that cores #0 and #1 are the two threads on the first physical core, and #2 and #3 are the two threads on the second... I wasn't 100% clear on that.

I will point out that I'm using Windows 10, though I doubt there are any sudden severe thread scheduling bugs in it. I'll test it in Ubuntu 15.04 soon.
05-12-2015, 01:23 PM (This post was last modified: 05-12-2015, 01:23 PM by Fiora.)
#5
Fiora Offline
x86 JIT Princess
**********
Developers (Some Administrators and Super Moderators)
Posts: 237
Threads: 0
Joined: Aug 2014
Have you checked the clock frequency during the tests? I wonder if Windows is being particularly terrible with the scheduling, such that it ends up with a lower clock rate from Turbo Boost (because of the core switching).
05-12-2015, 03:17 PM
#6
mbc07 Offline
Wiki Caretaker
*******
Content Creators (Moderators)
Posts: 3,562
Threads: 47
Joined: Dec 2010
AFAIK Core i3 CPUs don't have Turbo Boost...
05-12-2015, 03:25 PM (This post was last modified: 05-12-2015, 03:27 PM by admin89.)
#7
admin89 Offline
Overclocker™ ✓ᵛᵉʳᶦᶠᶦᵉᵈ
*******
Posts: 6,889
Threads: 127
Joined: Nov 2009
Quote: I don't think laptop i3s are hyperthreaded
Mobile i3s are the same as desktop i3s (2 cores, 4 threads).
All mobile i5s are dual-core with HT plus Turbo Boost. Desktop i5s are either dual-core with HT or quad-core (depending on the model number).
Mobile i7s are either dual-core with HT or quad-core with HT. Desktop i7s are quad-core with HT.

So it might work on both mobile i3 and mobile i5 (I don't know, I haven't tried it yet).
05-13-2015, 08:49 AM
#8
IVCrumpet Offline
Junior Member
**
Posts: 40
Threads: 2
Joined: Apr 2015
(05-12-2015, 01:23 PM)Fiora Wrote: Have you checked the clock frequency during the tests? I wonder if Windows is being particularly terrible with the scheduling, such that it ends up with a lower clock rate from Turbo Boost (because of the core switching).
Clocks are the same, no turbo on desktop i3.

(05-12-2015, 03:25 PM)admin89 Wrote: it might work on both mobile i3 and mobile i5
Thanks for the info there. It perhaps warrants checking by someone with those CPUs. I'll edit the title to be a bit vaguer.
05-13-2015, 02:47 PM (This post was last modified: 05-13-2015, 02:49 PM by nex86.)
#9
nex86 Offline
Member
***
Posts: 127
Threads: 10
Joined: Oct 2013
/AFFINITY 5
Doesn't that mean that Dolphin will use up to 5 "virtual" cores?
How many threads can Dolphin actually utilize?

I know I was never able to use 100% of 4 cores, and games were slowing down even at 50-60% CPU usage.
05-13-2015, 04:41 PM
#10
Qaazavaca Qaanic
Unregistered
 
NO!
The affinity is a hexadecimal bitmask: 0x5 = 0b0101.
If the two logical cores of each physical core are numbered consecutively, the mask assigns Dolphin to logical cores 0 and 2, i.e. one thread on each of physical cores 0 and 1.
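
The bitmask reading can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are invented for the example and have nothing to do with Dolphin or Windows themselves:

```python
def mask_to_cores(mask):
    """Return the logical core indices whose bits are set in an affinity mask."""
    cores = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            cores.append(bit)
        bit += 1
    return cores


def cores_to_mask(cores):
    """Build an affinity mask with one bit set per logical core index."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


if __name__ == "__main__":
    print(mask_to_cores(0x5))                  # [0, 2]
    print(format(cores_to_mask([0, 2]), "X"))  # 5
```

So `/AFFINITY 5` means "logical cores 0 and 2", not "five cores". A mask of F (0b1111) would allow all four logical cores of a 2C/4T chip.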