Dolphin, the GameCube and Wii emulator - Forums

Full Version: [UNOFFICIAL] AMD/ATI GPU Performance Guide feat. Dolphin
*******************************************************
AMD/ATI GPU Performance Guide feat. Dolphin - v0.995 (Dec 19, 2015)
*******************************************************

NOTE: MS Windows (7, 8, 8.1, 10) only [for now]. GNU/Linux (Ubuntu) update coming soon.

Recommended GPUs
=================
Any modern GPU (with a full Direct3D11 feature set) supported by the latest AMD display drivers:

* ATI Radeon HD 5000 series
* AMD Radeon HD 6000 Series
* AMD Radeon HD 7000 Series
* AMD Radeon HD 8000 Series
* AMD Radeon Rx 200 Series
* AMD Radeon Rx 300 Series
* AMD Radeon R9 Fury/Nano Series
* Integrated graphics (on AMD APUs) based on the same GPU architecture(s)

Drivers
=======
* Uninstall your old GPU drivers:

Use Display Driver Uninstaller (DDU) and choose the option to reboot and uninstall the GPU drivers in Safe Mode (recommended).
Download it from here:
http://www.wagnardmobile.com/DDU/

* Update your GPU drivers to the latest stable version:

Download and install the AMD Catalyst 15.11.1 BETA drivers from this link:
http://www2.ati.com/drivers/beta/amd-cat...-nov14.exe

NOTE: Avoid the buggy "Crimson" drivers from the official web site (http://support.amd.com/) until they're fixed.
Make sure you select the correct driver package for your OS.

Catalyst Control Center Settings
=========================
* In the "3D settings" section, set the Texture Filtering Quality to "High Quality" and turn off "Surface Format Optimizations". Leave everything else at defaults.
This will increase stability, reduce graphical artifacts and improve the image quality.
[Image: attachment.php?aid=12557]

* Enable (unlock) AMD GPU Overdrive. This is safe: it will NOT overclock your GPU. Overdrive is required for running the GPU in high-performance mode. Then go to the "Software Info" tab and close CCC.
[Image: attachment.php?aid=12558]

High-Performance Mode
=================
High Performance mode is the only solution for:
- massive slowdown due to lazy clock switching or clocks stuck in 2D mode
- lag, tearing, macro- and micro-stutter issues due to frequent switching between power states

You can set your GPU to always run in High Performance mode by manually editing the file profiles.xml located in %SystemDrive%\Users\YourUserName\AppData\Local\ATI\ACE\ and applying the High-Performance Mode Trick.

The High-Performance Mode Trick
=========================
0. Before you begin:

- Close all 2D or 3D apps which may use the GPU, including video players and web browsers.

NOTE #1: CCC should be set to always start at the "Software Info" tab. NEVER perform the high-performance mode trick while CCC is displaying the "Overdrive" Tab - it may not work properly.

NOTE #2: You should also disable the auto-start of AMD Catalyst Control Center. For more info, see the "Windows Settings" section.

1. First, make a backup of your original profiles.xml file.

2. Open the .xml file with Notepad and look for a section of text that contains these four strings: CoreClockTarget, MemoryClockTarget, CoreVoltageTarget and MemoryVoltageTarget.
Here's a generic example:

<Feature name="CoreClockTarget_PCI_VEN_...>
<Property name="Want_0" value="2D MHz" />
<Property name="Want_1" value="UVD MHz" />
<Property name="Want_2" value="3D MHz" />
<Property name="Want_3" value="3D Boost MHz" />
</Feature>

<Feature name="MemoryClockTarget_PCI_VEN_...>
<Property name="Want_0" value="m2D MHz" />
<Property name="Want_1" value="mUVD MHz" />
<Property name="Want_2" value="m3D MHz" />
<Property name="Want_3" value="m3D Boost MHz" />
</Feature>

<Feature name="CoreVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="2D Volts" />
<Property name="Want_1" value="UVD Volts" />
<Property name="Want_2" value="3D Volts" />
<Property name="Want_3" value="3D Boost Volts" />
</Feature>

<Feature name="MemoryVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="0" />
<Property name="Want_1" value="0" />
<Property name="Want_2" value="0" />
<Property name="Want_3" value="0" />
</Feature>

Depending on your GPU, you may have 2 or more power states (<Property name="Want_X" ... /> lines) for each Feature.

3. Delete all lines with intermediate power states (whether you have any depends on your GPU). The min. and max. states are all you need.
The modified generic example should look like this:

<Feature name="CoreClockTarget_PCI_VEN_...>
<Property name="Want_0" value="2D MHz" />
<Property name="Want_3" value="3D Boost MHz" />
</Feature>

<Feature name="MemoryClockTarget_PCI_VEN_...>
<Property name="Want_0" value="m2D MHz" />
<Property name="Want_3" value="m3D Boost MHz" />
</Feature>

<Feature name="CoreVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="2D Volts" />
<Property name="Want_3" value="3D Boost Volts" />
</Feature>

<Feature name="MemoryVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="0" />
<Property name="Want_3" value="0" />
</Feature>

4. Now replace the min. value in the first line with the max. value from the second line. Do the same for all 4 features.
The modified generic example should look like this:

<Feature name="CoreClockTarget_PCI_VEN_...>
<Property name="Want_0" value="3D Boost MHz" />
<Property name="Want_3" value="3D Boost MHz" />
</Feature>

<Feature name="MemoryClockTarget_PCI_VEN_...>
<Property name="Want_0" value="m3D Boost MHz" />
<Property name="Want_3" value="m3D Boost MHz" />
</Feature>

<Feature name="CoreVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="3D Boost Volts" />
<Property name="Want_3" value="3D Boost Volts" />
</Feature>

<Feature name="MemoryVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="0" />
<Property name="Want_3" value="0" />
</Feature>

5. Finally, adjust the numbering of the lines, so the first line always has a 0 (Want_0) and the second line has a 1 (Want_1).
The FINAL modified generic example should look like this:

<Feature name="CoreClockTarget_PCI_VEN_...>
<Property name="Want_0" value="3D Boost MHz" />
<Property name="Want_1" value="3D Boost MHz" />
</Feature>

<Feature name="MemoryClockTarget_PCI_VEN_...>
<Property name="Want_0" value="m3D Boost MHz" />
<Property name="Want_1" value="m3D Boost MHz" />
</Feature>

<Feature name="CoreVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="3D Boost Volts" />
<Property name="Want_1" value="3D Boost Volts" />
</Feature>

<Feature name="MemoryVoltageTarget_PCI_VEN_...>
<Property name="Want_0" value="0" />
<Property name="Want_1" value="0" />
</Feature>

6. Now start both GPU-Z (download the latest version from TechPowerUp!) and the Windows Task Manager.

7. Start Catalyst Control Center manually (right click on the Windows desktop and select the option from the context menu).
Check the GPU Clock and Default Clock in the "Graphics Card" Tab of GPU-Z (if your card is overclocked, the "GPU Clock" values should differ from the defaults. Otherwise, they should be the same as the "Default" clocks).
Close CCC, wait 30sec ~ 1min. and then *end* the CCC.exe and MOM.exe processes with the Task Manager.

8. Now click on the "Sensors" Tab in GPU-Z and start CCC manually again while watching for any changes in the GPU Core Clock and GPU Memory Clock graphs. The card should now go into high-performance mode and clocks should switch to max. boost state.
Quickly close CCC after it starts, wait 30sec ~ 1min., end the CCC.exe and MOM.exe processes with the Task Manager and finally mouse over the system tray to clear the lingering CCC icons. That's it. The clocks should now stick until you reboot.

9. Next time you want to enable High-Performance mode (after a reboot), you only need to perform steps 7. and 8.
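
Steps 1 to 5 above are mechanical, so they can be scripted. What follows is a minimal Python sketch, not a tested tool. It assumes that profiles.xml parses as well-formed XML and that every Want_N Property is a direct child of its Feature, as in the generic example. The function names pin_to_max_state and patch_profiles are made up for illustration, and the path to profiles.xml is whatever you pass in.

```python
import shutil
import xml.etree.ElementTree as ET

# The four Feature name prefixes listed in step 2.
TARGET_PREFIXES = (
    "CoreClockTarget",
    "MemoryClockTarget",
    "CoreVoltageTarget",
    "MemoryVoltageTarget",
)


def pin_to_max_state(root):
    """Apply steps 3-5 to a parsed tree; return the number of Features changed."""
    changed = 0
    for feature in list(root.iter("Feature")):
        if not feature.get("name", "").startswith(TARGET_PREFIXES):
            continue
        wants = [
            prop for prop in feature.findall("Property")
            if prop.get("name", "").startswith("Want_")
        ]
        if len(wants) < 2:
            continue  # fewer than two power states: nothing to collapse
        # Order the power states by their numeric suffix (Want_0, Want_1, ...).
        wants.sort(key=lambda prop: int(prop.get("name").split("_")[1]))
        lowest, highest = wants[0], wants[-1]
        # Step 3: delete the intermediate power states.
        for prop in wants[1:-1]:
            feature.remove(prop)
        # Step 4: copy the max. value into the min. state.
        lowest.set("value", highest.get("value"))
        # Step 5: renumber so the two remaining lines are Want_0 and Want_1.
        lowest.set("name", "Want_0")
        highest.set("name", "Want_1")
        changed += 1
    return changed


def patch_profiles(path):
    """Step 1 (backup to path + '.bak'), then steps 3-5 on the file at `path`."""
    shutil.copy2(path, path + ".bak")
    tree = ET.parse(path)
    changed = pin_to_max_state(tree.getroot())
    # Assumption: the file can be rewritten as UTF-8 with an XML declaration.
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return changed
```

Call patch_profiles with the full path to your profiles.xml while CCC is closed. ElementTree drops XML comments and may change the whitespace of the file, so keep the backup and compare the result against the FINAL example before starting CCC.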

Windows Settings
==============
* Disable the auto-start of the AMD Catalyst Control Center application (CCC is a buggy and unpredictable resource hog).
Windows 8, 8.1, 10 users can do this easily with the new Task Manager.
Windows 7 users should use a third-party app (CCleaner v4.17):
[Image: attachment.php?aid=12560]
Then reboot the PC.
If you ever need CCC to adjust some settings, you can always start it manually by right-clicking on the Windows Desktop.

* For Windows 7 users only:
- For lower input latency and improved performance in Borderless Fullscreen and Windowed mode, it's recommended to disable Desktop Composition (DWM / Aero).
[Image: attachment.php?aid=12559]
NOTE: When Desktop Composition is disabled, always set the Windows Taskbar to "Auto-Hide". Leaving the Taskbar always visible (the default setting) may negatively affect your performance in Borderless Fullscreen and Windowed mode.

Dolphin Settings
=============
* Make sure you're using the latest development build of Dolphin. You can download it from here: https://dolphin-emu.org/download/list/master/1/
The latest build will give you optimal performance, improved accuracy and the latest feature set.

* Always use the (incredibly fast on AMD) Direct3D backend. OpenGL (even with the GL_AMD_Pinned_Memory extension) is still slow on AMD GPUs, especially with CPU->EFB Access and EFB to RAM.
[Image: attachment.php?aid=12561]

* Use "Borderless Fullscreen" instead of "Exclusive Fullscreen":
[Image: attachment.php?aid=12562]
The Exclusive FS mode in Dolphin is a known CPU hog. It increases the CPU load, leading to a noticeable drop in performance.

* Do *NOT* enable the "Per-Pixel Lighting" option. It's a HUGE resource hog. When this enhancement is active, Dolphin generates an order of magnitude more shaders, which greatly increases the load on its (inefficient) shader compiler and leads to massive stuttering.
[Image: attachment.php?aid=12572]

* Anisotropic Filtering is best left at the default setting (1x). It introduces graphical artifacts and at higher internal resolutions it's just a waste of resources (no improvement in image quality).
[Image: attachment.php?aid=12571]
But AF is useful in those rare cases where you want to improve the appearance of unfiltered textures.
Using 16xAF instead of the "Force Filtering" option will result in much better image quality.

* Higher internal resolution and Anti-Aliasing (for users with decent GPUs who care about image quality):
For a 1080p display, most users would choose 3xIR and enable Standard AA (usually 4x MSAA, EQAA or CSAA) when using Direct3D (since SSAA is available only with the OpenGL backend). But that's *not* the optimal setting for AMD GPUs. Standard AA has a negative effect on performance (major slowdown) and doesn't improve the image quality that much.
A smarter and superior way to do AA in Dolphin is to increase the internal resolution to 6x and *disable* the standard Anti-Aliasing.
6xIR is the equivalent of 3xIR + 4xSSAA (or 4K downsampled to 1080p), so it produces a higher quality image than standard MSAA and improves the performance at the same time.
For other screen resolutions, disable the standard AA and use an IR that's 2x the optimal IR for your display. For a 2560x1440 screen, that means (4xIR)x2 = 8xIR.

* Using an IR greater than 2x your screen resolution (such as 7xIR or higher for 1080p) is a waste of GPU resources for a minimal or no improvement in image quality. It may even degrade image quality in some cases.

* If you're still experiencing unusually low performance, you should try Dolphin with VSync disabled. *Some* AMD GPU/driver combinations don't like this setting (there's a massive performance drop with VSync ON):
[Image: attachment.php?aid=12565]

* If you still have stuttering issues even in high-performance mode, that's most likely due to the way real-time shader cache compilation works in Dolphin.
To reduce this unwanted effect, try one of these three solutions:

1. Use Tino's unofficial Ishiiruka build:
[Image: attachment.php?aid=12566]
https://forums.dolphin-emu.org/Thread-un...om-version
with the Direct3D11 backend and "Full Async Shader Compilation" enabled.
NOTE: If you want to use Ishiiruka alongside the latest master build, it's recommended to run it in portable mode (see the section for advanced users)

2. Switch to the slower OpenGL backend - it's less sensitive to this issue.
NOTE: The latest AMD OpenGL driver now features a shader cache, similar to the one in the nVIDIA drivers. It complements Dolphin's own shader cache and reduces stuttering. The new shader cache directory is located here: %SystemDrive%\Users\YourUserName\AppData\Roaming\AMD\GLCache\

3. Just run Dolphin multiple times until the shader cache builds up completely.
NOTE: Dolphin will automatically delete your shader cache if you update to a newer version, update your GPU drivers or change your hardware.
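
The internal-resolution rule from the list above (disable the standard AA and use an IR that's 2x the optimal IR for your display) is simple arithmetic. Here is a small Python sketch of it. It assumes that Dolphin's 1x internal resolution is 640 pixels wide and that the "optimal" IR is the smallest multiplier whose width covers the screen. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

NATIVE_WIDTH = 640  # assumed width of Dolphin's 1x internal resolution


def optimal_ir(screen_width):
    """Smallest IR multiplier whose rendered width covers the screen width."""
    return math.ceil(screen_width / NATIVE_WIDTH)


def recommended_ir(screen_width):
    """The guide's rule: 2x the optimal IR, with standard AA disabled."""
    return 2 * optimal_ir(screen_width)


for width, height in [(1920, 1080), (2560, 1440)]:
    print(f"{width}x{height}: {optimal_ir(width)}xIR optimal, "
          f"{recommended_ir(width)}xIR with standard AA disabled")
```

Under these assumptions the sketch gives 6xIR for a 1920x1080 screen and 8xIR for a 2560x1440 screen, which matches the values in the guide. Anything above the recommended value falls under the "waste of GPU resources" bullet.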


**************************
FOR ADVANCED USERS ONLY
**************************

Running Dolphin in Portable Mode (For Testing Purposes)
=========================================

1. Create a new folder on your drive (e.g. Dolphin_Test1).

2. Extract the contents of the .7z archive with the Dolphin build into that folder.

3. Create a blank file named portable.txt in the same folder.

4. Start Dolphin and then close it, so the User folder is generated.

5. Copy all your regular build saves (not the settings, just the saves) from your Documents folder to the respective dir(s) in the User folder.

6. Start Dolphin, adjust the settings to your preference and then close it again.

7. Now you can use the test build without the risk of messing up your normal Dolphin installation.
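
Steps 1 and 3 above (creating the folder and the blank portable.txt) can also be done from a script. This is a minimal Python sketch under that assumption. The function name make_portable is made up for illustration, and extracting the .7z archive (step 2) is left out because it needs a separate tool.

```python
from pathlib import Path


def make_portable(folder):
    """Create the folder (step 1) and an empty portable.txt inside it (step 3)."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    marker = folder / "portable.txt"
    marker.touch(exist_ok=True)  # leaves an existing file untouched
    return marker
```

For example, make_portable("Dolphin_Test1") creates the folder in the current directory. Extract the Dolphin build into the same folder before starting it.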


Overclocking your AMD GPU using the High Performance Trick: A Mini-Guide /!\ Proceed at your own risk /!\
========================================================================

1. Download and install MSI Afterburner 4.x from MSI's official site. During the install process, select only the main application. Do not install any extras.

2. Start the app, go to the options, find the setting "Extend the OFFICIAL CCC overclocking limit" and enable it. Close the app and reboot.

3. Uninstall MSI Afterburner. You don't need it anymore :)

4. Overclocking is done using the High-Performance Trick. No third-party apps necessary. Set the PowerTune percentage, GPU and Memory clocks in the profiles.xml file to a value higher than the defaults (it's recommended to do this in small increments) and test for stability with a GPU stress test app (and Dolphin).

/!\ WARNING /!\
Increasing the voltage is NOT RECOMMENDED. It will damage or shorten the life of your GPU. You have been warned.

NOTE: Performing Step #7 of the High-Performance Trick will overclock your GPU, but still keep the power saving functions enabled. Step #8 additionally enables Hi-Perf mode / disables power saving.

***********
       v0.995
***********
*Placeholder for screenshots*
Quote:* In the "3D settings" section, set the Texture Filtering Quality to "Very High" and disable Surface Format Optimizations. Leave everything else at defaults.

* To improve 1080p/UltraHD/4K/60fps/120fps video playback speed and quality, disable all video enhancements. The only options that should be enabled are "Automatic vector adaptive deinterlacing", "Pulldown detection", "Ensure smooth video playback" and "Apply settings to internet video".
* Untick the option "Automatically Check for Updates"
This doesn't do anything besides create placebo. The first one could theoretically improve performance but I don't think it does besides 1 - 2 FPS.
Quote:* For improved performance and lower input latency under Windows 7, it's recommended to disable Desktop Composition (DWM / Aero / UxSms). NOTE: This is not possible in later versions of Windows (8, 8.1 or 10).

* Disable the auto-start of AMD Catalyst Control Center, so it won't load on every boot. CCC is an unpredictable resource hog. To do this, use the New Task Manager (Windows 8, 8.1, 10) or a third-party app like CCleaner v4.17 (Windows 7). Then reboot the PC.
* Disable the AMD FUEL Service: Go to Administrative Tools -> Services, find the service, right-click, select "Properties" and set auto-start to "disabled".
This won't do anything either. Desktop Composition may theoretically hinder performance, but we have exclusive fullscreen to take care of that. Messing with System Services is the wrong way to improve performance and it will just create other issues.
Quote:* Always use the (incredibly fast on AMD) Direct3D backend. OpenGL (even with the Pinned_Memory extension) is still dog slow on AMD GPUs.
Last time I checked, OpenGL was a lot faster or on par with D3D

Obviously something happened while I was gone.
Quote:* Use "Borderless Fullscreen" instead of "Exclusive Fullscreen". The Exclusive FS implementation in Dolphin still has some unresolved issues - it doesn't work every time, and when it does, it increases the CPU load, leading to a noticeable drop in performance (CPU hog).
You should open a new issue on Googlecode about this, I don't experience this issue but maybe some others do.

Quote:Using an IR greater than 2x your screen resolution (such as 7xIR or higher for 1080p) is a waste of GPU resources and will NOT improve image quality.
Yes it will. The image will get downsampled from that resolution which is basically Supersampling. It will remove jaggies, but there shouldn't be any at that point. However, it's still wrong to say that it won't improve image quality.


Besides that, your Linux guide should just say to get a NVIDIA card, because that's basically everything you can do if you want to improve your performance on Linux. I'm not kidding. You either get a reliable driver, or a fast one that's still miles away from the performance on Windows.
(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]The first one could theoretically improve performance but I don't think it does besides 1 - 2 FPS.

The first set of tweaks does quite the opposite. It improves driver stability, reduces artifacts in Dolphin and increases image quality at the expense of some performance.

(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]This won't do anything either. Desktop Composition may theoretically hinder performance, but we have exclusive fullscreen to take care of that.

This is mainly for improving performance in Borderless Fullscreen (the recommended setting for better performance) and Windowed mode.

(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]This doesn't do anything besides create placebo.

Messing with System Services is the wrong way to improve performance and it will just create other issues. 

Removed the "placebo" tweaks.

(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]Last time I checked, OpenGL was a lot faster or on par with D3D

Obviously something happened while I was gone.

A lot has changed since then (for the worse for users with AMD GPUs).

(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]You should open a new issue about this, I don't experience this issue but maybe some others do.

It needs some testing first (I'll create a new thread in the dev forum)

(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]Yes it will. The image will get downsampled from that resolution which is basically Supersampling. It will remove jaggies, but there shouldn't be any at that point. However, it's still wrong to say that it won't improve image quality.

By not improving image quality, I mean it may actually look even worse.

(12-29-2014, 01:54 AM)Anti-Ultimate Wrote: [ -> ]Your Linux guide should just say to get a NVIDIA card.

...for now.

The Open Source radeon / radeonsi driver and mesa/gl/llvm are improving at a rapid pace... a brand new 'amdgpu' driver that looks promising... Catalyst with full OpenGL 4.5 support, tons of bug fixes and performance improvements... lightweight Lubuntu with a Qt DE with less overhead and an improved user experience...
Who knows? Things may look very different when the next version of Ubuntu is released.

.
Performance Guide updated to v0.90
seems fine now
(12-28-2014, 11:10 PM)kirbypuff Wrote: [ -> ]* Always use the (incredibly fast on AMD) Direct3D backend.  OpenGL (even with the Pinned_Memory extension) is still dog slow on AMD GPUs.

Okay, quick question: what game would you recommend that reveals this fact? I have tried D3D for Melee, Brawl, and now 1080 Avalanche, with 4-players, and I can't help but notice micro and macro stuttering every 5 seconds.

My specs are as follows:

CPU -> i7 3930k OC'd to 4.5 GHz
RAM -> 16 GB 2400 MHz
MB -> Rampage IV Extreme (OC'd a lot of stuff)
GPU -> 2x Sapphire 7870 GHz Edition

When I use OpenGL, it works perfectly. No stutters, and in fact, runs smoother in a lot of cases compared to D3D.

P.S. I changed all my settings to suit your configuration, and D3D still runs like it is unoptimized. Is it due to the core structure of the 7870's? Are you referring to the new R9 series for better D3D performance?
1. Patience pays off: Wait for the next update to the performance guide (v0.95) *coming soon*

2. Disable Xfire (use only one GPU). Dolphin doesn't support multi-GPU.

3. Are you using MSI Afterburner or other similar tools to OC your GPUs?

4. Go to %SystemDrive%\Users\YourUserName\AppData\Local\ATI\ACE\ and post the contents of your profiles.xml file.

5. Is your GPU really running in high-performance mode? Check with GPU-Z.

6. The 7870 GHz Edition is the same card as a non-reference R9 270. The only difference is the memory speed:
7870 GHz has its RAM clocked at 4800MHz
R9 270 has faster RAM (5600 MHz). Even better, the memory chips are actually underclocked and rated to work at 6000 MHz.
(12-29-2014, 03:50 AM)kirbypuff Wrote: [ -> ]1. Patience pays off: Wait for the next update to the performance guide (v0.95) *coming soon*

2. Disable Xfire (use only one GPU). Dolphin doesn't support multi-GPU.

3. Are you using MSI Afterburner or other similar tools to OC your GPUs?

4. Go to %SystemDrive%\Users\YourUserName\AppData\Local\ATI\ACE\ and post the contents of your profiles.xml file.

5. Is your GPU really running in high-performance mode? Check with GPU-Z.

6. 7870 GHz = R9 270 with slower memory.

1. Thanks! :D

2. I might try this, but you know what's interesting? I actually did find a huge performance increase in F-Zero GX, especially on the sand level where there is a lot of heat distortion. I usually run on EFB to Texture, with the Texture Cache set to Fast and I disable the external frame buffer (sometimes I use virtual). What's amazing is that on D3D, F-Zero GX ran perfectly fast even with EFB set to RAM, and the Texture Cache set to safe! By the way, this is with crossfire on (I will test it with off after this posting).

3. Nope. I do have RadeonPro, but that avoids programs like Dolphin (I once forced it to work with Dolphin, but that didn't go well, because I know that Dolphin has its own tools for rendering the games).

4. From what I can see, it fits well with your configuration.

5. Yup.

6. Right, didn't know that.
Updated to v0.91
-------------------------
* various improvements
* fixed some typos