Dolphin, the GameCube and Wii emulator - Forums

Full Version: Direct3D performance regression on AMD since the "optimizations" in 4.0-3926
PR #1414 by kayru, merged in 4.0-3926, has a negative effect on the Direct3D performance of ATI/AMD GPUs.

Quoted from the October 2014 Progress Report:
Quote: D3D: Couple of small optimizations (PR #1414) by kayru

...4.0-3926 brought *another* sizable performance boost to the D3D backend. D3D will be roughly 10% faster than before.

People who have tested this build with an NVIDIA GPU have seen a 5 to 10% performance improvement, but with AMD GPUs there's a performance hit of up to 10%, which is the exact opposite behavior.

Seems like AMD's GPU/driver combination really doesn't like this change (or part of it).
OK, seriously, do you do anything other than complaining? You've started like 5 posts in one day doing nothing but whining about this or that supposed performance regression. And you are providing very little data even though what you are saying contradicts a wide variety of testing that has already been done. You should be providing specific reproducible scenarios with information on specific builds. Also, provide your full hardware specs and operating system information.

EDIT: and don't fill up the subject line. Anyone that tries to reply to you needs at least 4 characters free since MyBB insists on creating a subject for every single reply with "RE: " in front of it.
I have the same issue. Even light games like Brawl/Project M don't run at full speed anymore, even without the HD texture packs, and they used to run perfectly with them enabled before.
kayru Wrote:PR #1414 Oct 26, 2014

* Vertex and index data in one buffer
* Pixel shader resources are not reset unnecessarily

Vertex and index buffer are usually both updated for every draw call.
We can avoid some driver overhead related to mapping/unmapping by using just one buffer.
This is the same optimization as Galop1n did in #438.
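
For anyone unsure what the first bullet means in practice, here is a toy sketch of the idea. It is not kayru's code and makes no real Direct3D calls: `FakeBuffer`, `draw_separate` and `draw_shared` are hypothetical names, and the model only counts map/unmap round trips. It assumes, as the PR description says, that vertex and index data are both updated on every draw call.

```python
# Hypothetical model: counts map/unmap round trips, no real Direct3D involved.


class FakeBuffer:
    """Stand-in for a dynamic GPU buffer that counts map/unmap pairs."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.map_count = 0

    def write(self, offset, payload):
        # Each write models one map, one copy and one unmap.
        self.map_count += 1
        self.data[offset:offset + len(payload)] = payload


def draw_separate(vertex_buf, index_buf, vertices, indices):
    """Two buffers: every draw call maps both of them."""
    vertex_buf.write(0, vertices)
    index_buf.write(0, indices)


def draw_shared(shared_buf, vertices, indices):
    """One buffer: vertices and indices are uploaded with a single map."""
    shared_buf.write(0, vertices + indices)
    # Byte offsets where the vertex and index data start in the shared buffer.
    return 0, len(vertices)


vertices = bytes(range(48))  # pretend vertex data
indices = bytes(range(12))   # pretend index data
draws = 1000

vb, ib = FakeBuffer(64), FakeBuffer(64)
for _ in range(draws):
    draw_separate(vb, ib, vertices, indices)
separate_maps = vb.map_count + ib.map_count

shared = FakeBuffer(128)
for _ in range(draws):
    vertex_offset, index_offset = draw_shared(shared, vertices, indices)
shared_maps = shared.map_count

print("separate buffers:", separate_maps, "maps")
print("shared buffer:  ", shared_maps, "maps")
```

The shared layout halves the number of map/unmap round trips in this model. Whether that translates into a speedup or a slowdown on a given driver is exactly what is being argued about in this thread; the sketch says nothing about that.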

Looks like AMD's GPU drivers really don't like this change (or part of it).

Some of Galop1n's tweaks do not behave well on ATI/AMD. They make the framerate unstable (fluctuating) or reduce performance.
Galop1n's custom build has this variable framerate issue as well. Master builds before this commit [PR #1414] had no such problem (the framerate was stable)

The best place to test this is the NSMB title screen or L1-1 with EFBtoRAM at high IRs (4x or 6x)
(11-02-2014, 01:30 PM)MaJoR Wrote: OK, seriously, do you do anything other than complaining? You've started like 5 posts in one day doing nothing but whining about this or that supposed performance regression.

Not complaining, just posting issue reports here since the respective PRs/commits on GitHub are old and already closed, so it's useless to post a comment there.
Bug/regression reports - that's the thing devs want the most, isn't it?

MaJoR Wrote:and don't fill up the subject line. Anyone that tries to reply to you needs at least 4 characters free since MyBB insists on creating a subject for every single reply with "RE: " in front of it.

Noted.


P.S. There's already one person who is also experiencing the same (or similar) performance issues.
kirbypuff Wrote:Not complaining, just posting issue reports here since the respective PRs/commits on GitHub are old and already closed, so it's useless to post a comment there.
Bug/regression reports - that's the thing devs want the most, isn't it?

You should really be posting these on the issue tracker. The devs don't pay a lot of attention to the forum for this stuff.

https://code.google.com/p/dolphin-emu/issues/list

Just remember to post TONS of information. We have to be able to reproduce it.
The Texture Hash thing won't affect GPUs, only CPUs. And if the AMD CPUs are actually slower at it, then that's because they really suck at doing their job. If they somehow lose performance on the texture loop stuff, then whatever.

Secondly, the report doesn't make sense as Project M/Brawl aren't texture hash limited games, making me doubt the validity of the regression tests. I have an AMD GPU I test with, so I know the D3D statecache actually helps with the AMD video cards. And I'm fairly certain the D3D statecache can't touch CPU emulation.
(11-02-2014, 06:29 PM)JMC47 Wrote: The Texture Hash thing won't affect GPUs, only CPUs. And if the AMD CPUs are actually slower at it, then that's because they really suck at doing their job. If they somehow lose performance on the texture loop stuff, then whatever.

Secondly, the report doesn't make sense as Project M/Brawl aren't texture hash limited games, making me doubt the validity of the regression tests. I have an AMD GPU I test with, so I know the D3D statecache actually helps with the AMD video cards. And I'm fairly certain the D3D statecache can't touch CPU emulation.

Tested on NSMB with EFB to RAM at 4x and 6xIR with texture cache set to 'Safe'.

The state cache (PR #475) does give a *significant* speed boost on any card (NVIDIA and AMD). It's the commit after that one - the small 'optimizations' (PR #1414) which reduces performance on AMD cards and causes the framerate to fluctuate.
So I tested 4.0-3921 and 4.0-3926 with my AMD Radeon HD7790 with D3D, and 4.0-3926 is faster for me in Wind Waker (40 -> 43 fps, around 30 fps before kayru's 1st patch) and on the New Super Mario Bros title screen (130 -> 133 fps). My Catalyst version is 14.9, my CPU is an Intel Core 2 Duo E6750 clocked at 3.2 GHz and I'm using Win 7 64-bit.

For me both changes improved the performance, so -1 on this regression from me.
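
To put the figures above on a common scale, here is a small sketch that converts the before/after fps readings into a percentage change and a per-frame time. The helper names `percent_change` and `frame_time_ms` are made up for illustration; the only inputs are the fps values quoted in the post.

```python
def percent_change(before_fps, after_fps):
    """Relative change in frame rate, in percent (positive means faster)."""
    return (after_fps - before_fps) / before_fps * 100.0


def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# fps readings quoted in the post: (4.0-3921, 4.0-3926)
results = {
    "Wind Waker": (40, 43),
    "NSMB title screen": (130, 133),
}

for name, (before, after) in results.items():
    change = percent_change(before, after)
    saved = frame_time_ms(before) - frame_time_ms(after)
    print(f"{name}: {change:+.1f}% fps, {saved:.2f} ms less per frame")
```

The same 3 fps gain is about 7.5% in Wind Waker but only about 2.3% on the NSMB title screen, so comparing percentages or frame times across several runs is more informative than comparing raw fps differences.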
(11-02-2014, 06:39 PM)kirbypuff Wrote:
(11-02-2014, 06:29 PM)JMC47 Wrote: The Texture Hash thing won't affect GPUs, only CPUs. And if the AMD CPUs are actually slower at it, then that's because they really suck at doing their job. If they somehow lose performance on the texture loop stuff, then whatever.

Secondly, the report doesn't make sense as Project M/Brawl aren't texture hash limited games, making me doubt the validity of the regression tests. I have an AMD GPU I test with, so I know the D3D statecache actually helps with the AMD video cards. And I'm fairly certain the D3D statecache can't touch CPU emulation.

Tested on NSMB with EFB to RAM at 4x and 6xIR with texture cache set to 'Safe'.

The state cache (PR #475) does give a *significant* speed boost on any card (NVIDIA and AMD). It's the commit after that one - the small 'optimizations' (PR #1414) which reduces performance on AMD cards and causes the framerate to fluctuate.

Btw, those settings are crazy demanding and, except for EFB to RAM, unnecessary to play the game. You don't need 6x IR unless you have a 4K or higher-res monitor, and the texture cache can be set to the middle position and it will work fine.