Dolphin ICC Intel optimized builds (SSE3/4/AVX) (Latest:3.5-420 x64) [UNOFFICIAL]
02-03-2013, 01:27 AM (This post was last modified: 02-22-2013, 06:48 AM by leetminiwheat.)
#1
leetminiwheat Offline
DISCLAIMER: These are UNOFFICIAL Dolphin builds and come with no support from the Dolphin team. Do not report bugs to them when using these builds (if you find a bug, test an official build first). These builds are compiled from experimental development source code with my own optimizations, which may at times break things. Use at your own risk. Official Dolphin builds can be downloaded here.

I'm sharing these ICC Intel-only Windows builds in case anyone finds them useful. I've been rebuilding every few commits (using completely new source and manually re-applying all optimizations each time). The intent is to make Dolphin a little bit faster by using Intel's compiler with architecture-specific optimizations, but there may or may not be a speed difference. Feel free to test and/or offer feedback, thanks. Also, don't forget to add dsp_coef.bin and dsp_rom.bin to the User/GC/ folder if you use LLE audio.


Build Software Used:

Microsoft Visual Studio 2010 Professional
Intel® C++ Studio XE 2013 (with Intel® C++ Compiler XE 13.1)
Optimizations applied:
SSSE3,SSE4.1,SSE4.2,AVX builds (Supports Core 2 Duo and later):
All projects except Languages and SCMRevGen:
  • Enable Enhanced Instruction Set: None, overridden by default Intel codepath in /QxSSSE3
  • Processor-Optimized Code Path: Intel® Core™ processor family with Intel® Advanced Vector Extensions support (/QaxAVX) (this generates optional codepaths for SSE4.1, SSE4.2, and AVX where applicable, depending on your CPU)
  • Intel Processor-Specific Optimization: Intel® Core™2 processor family with Supplemental Streaming SIMD Extensions 3 (SSSE3) (/QxSSSE3) (this becomes the default base codepath)
  • OpenMP Support: Generate Parallel Code (/Qopenmp)
  • Command Line: /MP /Qstd=c++0x
Dolphin/DSPTool projects only:
  • Optimize for windows .EXE application (/GA)
VideoCommon project:
  • _M_SSE=0x402 added to "C/C++" -> "Preprocessor" -> "Preprocessor Definition" for hand coded SSE3/SSE4 optimizations.
VideoOGL project (same as above except these):
  • Use Visual C++ Compiler (ICC is supposedly slower on this)
  • Command Line: /MP /favor:INTEL64
AVX Builds (discontinued, these builds showed no speed improvement over the above SSSE3,SSE4.1,SSE4.2,AVX build):
All projects except Languages and SCMRevGen:
  • Enable Enhanced Instruction Set: None, overridden by default Intel codepath in /QxAVX
  • Processor-Optimized Code Path: None; this build already has a default codepath for AVX, so adding older codepaths is pointless.
  • Intel Processor-Specific Optimization: Intel® Core™ processor family with Intel® Advanced Vector Extensions support (/QxAVX)
  • OpenMP Support: Generate Parallel Code (/Qopenmp)
  • Command Line: /MP /Qstd=c++0x
Dolphin/DSPTool projects only:
  • Optimize for windows .EXE application (/GA)
VideoCommon project:
  • "_M_SSE=0x402" added to "C/C++" -> "Preprocessor" -> "Preprocessor Definition" for hand coded SSE3/SSE4 optimizations.
VideoOGL project ONLY (same as above except these):
  • Use Visual C++ Compiler (ICC is supposedly slower on this)
  • Command Line: /MP /favor:INTEL64
AMD builds
  • coming soon
Download (full revision changelogs listed here):
  1. AVX+OpenMP builds require a 2nd gen Core CPU or later (Sandy or Ivy bridge i3, i5, i7). These builds have been discontinued since they showed no speed improvement.
  2. SSSE3,SSE4.1,SSE4.2,AVX+OpenMP builds include codepaths for all CPUs from Core 2 Duo onward
  3. AMD - possibly coming soon.
  • 420: Revision 29d43ef89727 has some game ini updates. I messed up my original attempted Zelda SS fix, so I fixed that (hopefully). Uploading my usual ICC builds, plus a standard vanilla MSVC build with the Zelda-SS-Fix, which should run on both AMD and Intel.
Dolphin 3.5-420 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
Dolphin 3.5-420 [Zelda-SS-Fix] x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
Dolphin 3.5-420 [Zelda-SS-Fix] x64 MSVC Intel+AMD
  • 419: Another DX11 fix
Dolphin 3.5-419 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
Dolphin 3.5-420 [Zelda-SS-Fix] x64 MSVC SSE3,SSE4 Intel+AMD
  • 416: Uploading OpenMP and non-OpenMP builds. Please note this does NOT affect the OpenMP texture decoder, which is still there in all builds; my OpenMP builds enable OpenMP for all of Dolphin, wherever ICC feels it might benefit from parallelization. Please test the difference between OpenMP and non-OpenMP. Also uploading builds with a potential fix for the Zelda Skyward Sword crash in silent realms (Issue 5682).
Dolphin 3.5-416 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
Dolphin 3.5-416 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX
Dolphin 3.5-416 [Zelda-SS-patch] x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP (download removed, patch was bad. fixed builds will be above.)
Dolphin 3.5-416 [Zelda-SS-patch] x64 ICC SSSE3,SSE4.1,SSE4.2,AVX (download removed, patch was bad. fixed builds will be above.)
  • 413: Revision 19ab5bf50d51 fixes crashes in some games
Dolphin 3.5-413 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
  • 412: A few things changed
Dolphin 3.5-412 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
  • 402: few worthwhile changes... check commit logs for details. no AVX-only builds anymore, showed no speed difference. will clean up this post next time around probably, leaving stuff here for now. also, 428-real-wiimote-scanning has some critical fixes for windows.
Dolphin 3.5-402 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
Dolphin [real-wiimote-scanning] 3.5-428 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
  • 397: Only a couple of Linux/OSX fixes, not worth building another master. Updated the real-wiimote-scanning branch since it has a possible Windows fix; real-wiimote-scanning is still synced up to 393 master. I also tested an AVX-only build vs SSSE3,SSE4.1,SSE4.2,AVX vs vanilla, and there was basically zero difference. The result is here. Feel free to show me your results if you think there is a bigger difference; for now it doesn't seem worth building AVX-only or SSE4-only builds.

Dolphin [real-wiimote-scanning] 3.5-424 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP

Dolphin [real-wiimote-scanning] 3.5-423 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
  • 395: Two minor changes, not really worth upgrading from 393. Testing new hand-coded SSE3/SSE4 optimizations in VideoCommon for potential speed improvement, but the automatic compiler optimizations should already be doing a better job. Can't hurt though.
Dolphin 3.5-395 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP
Dolphin 3.5-395 x64 ICC (Intel C++ Compiler XE 13.1) SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
Dolphin [real-wiimote-scanning] 3.5-420 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX + OpenMP (based on 393 master, includes some wiimote fixes and automatic wiimote pairing)
Dolphin [FIFO-BP] 3.5-339 x64 ICC SSSE3,SSE4.1,SSE4.2,AVX (this was requested. based on a much older master, not entirely sure what it's supposed to fix. I heard on IRC that it's a bit broken on dual core mode, but I haven't tested.)
  • 393: Use different reply delays for various DI commands (fixes RE0). Also, I will no longer be doing O3 builds since they proved to be slower in my test, unless someone can show some improvement in other games (run with framelimit:off and compare fps).
Dolphin 3.5-393 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP
Dolphin 3.5-393 x64 ICC (Intel C++ Compiler XE 13.1) SSSE3,SSE4.1,SSE4.2,AVX + OpenMP
  • 392: Merge branch 'mipmap_fixes'. (possible speed improvements on video rendering, plus some other bug fixes)
Dolphin 3.5-392 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP (mirror)
Dolphin 3.5-392 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP + O3 (mirror)
Dolphin 3.5-392 x64 ICC (Intel C++ Compiler XE 13.1) SSE4.2 + OpenMP (mirror)
Dolphin 3.5-392 x64 ICC (Intel C++ Compiler XE 13.1) SSE4.2 + OpenMP + O3 (mirror)
  • 380: Metroid fixes from Revision 9cbfddd7883b
Dolphin 3.5-380 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP
Dolphin 3.5-380 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP + O3
Dolphin 3.5-380 x64 ICC (Intel C++ Compiler XE 13.1) SSE4.2 + OpenMP
Dolphin 3.5-380 x64 ICC (Intel C++ Compiler XE 13.1) SSE4.2 + OpenMP + O3
  • 375: Wiimote issues probably fixed by Revision 937d9e900717
Dolphin 3.5-375 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP
Dolphin 3.5-375 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP + O3
Dolphin 3.5-375 x64 ICC (Intel C++ Compiler XE 13.1) SSE4.2 + OpenMP
Dolphin 3.5-375 x64 ICC (Intel C++ Compiler XE 13.1) SSE4.2 + OpenMP + O3
  • 374: Alternate wiimote timing issue possibly fixed by Revision d5ec631337c7
Dolphin 3.5-374 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP
Dolphin 3.5-374 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP + O3
  • 368: Alternate wiimote timing still gone
Dolphin 3.5-368 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP
Dolphin 3.5-368 x64 ICC (Intel C++ Compiler XE 13.1) AVX + OpenMP + O3
  • 367: Alternate wiimote timing was removed, may be buggy in some games requiring it.
Dolphin 3.5-367 x64 ICC AVX + OpenMP
Dolphin 3.5-367 x64 ICC AVX + OpenMP + O3

Dolphin 3.5-358 x64 ICC AVX + OpenMP
Dolphin 3.5-358 x64 ICC AVX + OpenMP + O3

Dolphin 3.5-356 x64 ICC AVX + OpenMP

Dolphin 3.5-350 x64 ICC AVX + OpenMP
IRC: Aristar @ irc.freenode.net #dolphin-emu
CPU: Intel Core i5 2500K + Corsair H60 (dead)
GPU: Nvidia GeForce GTX 690 (EVGA)
MBD: Asus Maximus IV Gene-Z (Z68)
RAM: G.Skill RipjawsX 16GB (4x4GB) @ 2133MHz
SND: X-Fi Titanium Fatal1ty Pro
SSD: Samsung 830 128GB
PSU: Tt Toughpower Grand (750w Gold)
02-14-2013, 05:50 AM
#2
Ashbringer Offline
Do any of these builds work with AMD processors? I have a Bulldozer and can't get any of these to even start. Double-click and nothing happens, not even an error.
AMD FX-4100 @ 4.2 Ghz
8 gigs ram
AMD Radeon 6750 1GB
Windows 7 X64
02-14-2013, 10:48 AM
#3
delroth Offline
Quote: I'm sharing these ICC Intel-only Windows builds

Please read the post before asking such questions.
Pierre "delroth" Bourdon - @delroth_ - Blog

<@neobrain> that looks sophisticated enough to not be a totally dumb thing to do
02-15-2013, 12:47 AM (This post was last modified: 02-15-2013, 12:51 AM by leetminiwheat.)
#4
leetminiwheat Offline
Intel Processor-Specific Optimizations means they won't run on AMD. I could offer separate builds without these, but ICC-compiled binaries used to take the slowest codepath if they didn't detect GenuineIntel in the CPUID. Is this still the case on newer Intel C++ compilers? If the GenuineIntel patch is still required, feel free to provide it and I'll make AMD builds too.

Also keep in mind that in my own testing these builds have been within 3%-5% of the speed of the official builds (with Intel optimizations), so AMD-specific builds, being less optimized, may show no difference at all. I've only tested on my fast CPU, so maybe slower CPUs show a bigger difference, I don't know... but sometimes just a few percent can make the difference between something being playable or not.
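For anyone unsure what "detect GenuineIntel in the CPUID" refers to: CPUID leaf 0 returns a 12-byte vendor ID spread across the EBX, EDX and ECX registers, in that order. The sketch below only shows how those bytes spell out the string. Python cannot execute CPUID itself, so the register values are hard-coded constants, and the names `vendor_string`, `is_genuine_intel`, `INTEL_REGS` and `AMD_REGS` are made up for this example.

```python
import struct


def vendor_string(ebx, edx, ecx):
    """Assemble the 12-byte vendor ID from CPUID leaf 0 register values.

    Each register holds four ASCII bytes in little-endian order, and the
    registers are concatenated in the order EBX, EDX, ECX.
    """
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


# Register values that spell the two well-known vendor IDs.
INTEL_REGS = (0x756E6547, 0x49656E69, 0x6C65746E)
AMD_REGS = (0x68747541, 0x69746E65, 0x444D4163)


def is_genuine_intel(ebx, edx, ecx):
    """True only when the vendor ID is exactly 'GenuineIntel'."""
    return vendor_string(ebx, edx, ecx) == "GenuineIntel"


print(vendor_string(*INTEL_REGS))
print(vendor_string(*AMD_REGS))
print(is_genuine_intel(*AMD_REGS))
```

A check of this kind looks only at the vendor name, not at which instruction sets the CPU actually supports, which is why it matters for AMD users.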
02-15-2013, 09:11 AM (This post was last modified: 02-15-2013, 09:12 AM by crimeinal.)
#5
crimeinal Offline
I really appreciate these builds. The Last Story runs at a solid frame rate in pretty much every situation now.
Intel i5 3570k @ 4.2ghz
Radeon HD7750 OC
8gb Corsair 'Vengance' ram
Windows 7 Ultimate X64
02-15-2013, 05:17 PM
#6
lamedude Offline
If you're not adding an /arch switch with /Qax then the baseline path will be SSE2, which non-Intel CPUs will take. Agner Fog's optimizing manual has a way to patch that out from the source code, but I don't have an AMD CPU to test with so I haven't bothered. This will patch it in the exe.
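A simplified model of the dispatch behavior described in this post, as a sketch only: it is not ICC's actual dispatcher, and `select_codepath`, `ALTERNATES` and the feature sets are hypothetical names and values chosen for illustration.

```python
def select_codepath(vendor, cpu_features, alternate_paths, baseline="SSE2"):
    """Pick a code path following the behavior described in the thread.

    alternate_paths is ordered from most to least preferred. A CPU whose
    vendor ID is not GenuineIntel always falls through to the baseline,
    regardless of which instruction sets it supports.
    """
    if vendor == "GenuineIntel":
        for path in alternate_paths:
            if path in cpu_features:
                return path
    return baseline


ALTERNATES = ["AVX", "SSE4.2", "SSE4.1"]  # hypothetical preference order

avx_capable = {"SSE2", "SSE3", "SSSE3", "SSE4.1", "SSE4.2", "AVX"}
ssse3_only = {"SSE2", "SSE3", "SSSE3"}

print(select_codepath("GenuineIntel", avx_capable, ALTERNATES))
print(select_codepath("GenuineIntel", ssse3_only, ALTERNATES))
print(select_codepath("AuthenticAMD", avx_capable, ALTERNATES))
```

In this model, raising the baseline (the job of an /arch switch) is the only thing that changes what a non-Intel CPU runs, and a CPU lacking the baseline instruction set could not run the binary at all.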
02-16-2013, 04:55 PM (This post was last modified: 02-16-2013, 04:56 PM by leetminiwheat.)
#7
leetminiwheat Offline
new builds up, 402-master and 428-real-wiimote-scanning

(02-15-2013, 05:17 PM)lamedude Wrote: If you're not adding an /arch switch with /Qax then the baseline path will be SSE2, which non-Intel CPUs will take. Agner Fog's optimizing manual has a way to patch that out from the source code, but I don't have an AMD CPU to test with so I haven't bothered. This will patch it in the exe.
Are you sure that's still the behavior on ICC version 11 and higher? Hmm, well, I guess it can't hurt. Thanks, I will maybe give it a try whenever the real-wiimote-scanning branch gets merged into master (maintaining two builds is already getting to be a PITA). I'll probably do /Arch:SSE3 with /Qax:AVX for the AMD build so that it scales up from SSE3 to AVX (there's no /arch:SSSE3 unfortunately, otherwise I'd do that).
02-17-2013, 05:15 PM
#8
lamedude Offline
The AMD settlement just made Intel put up a notice that's on almost every ICC page; the GenuineIntel check is still there.
All the /Q[a]x options do have an /arch counterpart, but the ones not listed in the IDE would only be useful for Bobcat and VIA CPUs. All other pre-BD AMD CPUs only went up to SSE3.
02-17-2013, 11:57 PM (This post was last modified: 02-17-2013, 11:57 PM by ExtremeDude2.)
#9
ExtremeDude2 Online
lamedude already posted that patch >.>
02-20-2013, 04:31 AM (This post was last modified: 02-20-2013, 05:13 AM by etking.)
#10
etking Offline
A short test showed that your 3.5-413 ICC build runs faster than the unoptimized official 3.5-413 build on my mobile Core i7 SB. Tested in Zelda SS.

On the Sandship, both builds start at 28 FPS. If you go outside, the standard build sometimes drops to 23 FPS, while the ICC build only drops to 26 FPS, which is great. Zelda SS never ran faster, but unfortunately the silent realm crashes in DX9 and DX11 are unfixed and OpenGL runs much slower.

EDIT: typo corrected
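To put the figures above in relative terms, here is a minimal sketch of the arithmetic. The helper name `percent_faster` is made up for this example, and the inputs are just the two worst-case frame rates quoted in this post, which come from one short test and not a controlled benchmark.

```python
def percent_faster(baseline_fps, test_fps):
    """Relative speedup of test_fps over baseline_fps, in percent."""
    return (test_fps - baseline_fps) / baseline_fps * 100.0


# Worst-case outdoor figures reported above: 23 FPS official, 26 FPS ICC.
gain = percent_faster(23, 26)
print(f"{gain:.1f}% faster")
```

Since both builds start at the same 28 FPS, the difference only shows up in the heavier outdoor scene.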