OpenGL performance regression on (ATI/AMD) Radeon GPUs since 4.0-1778
11-02-2014, 11:34 AM (This post was last modified: 02-05-2015, 08:34 AM by kirbypuff.)
#1
kirbypuff Offline
The Original White Marshmallow
*****
Posts: 825
Threads: 37
Joined: Aug 2010
Quote from the July 2014 progress report:
Quote: Fix AMD Performance Regressions with Coherent Mapping Off

This one is a little bit awkward. Last month, a fix was merged to address a 30% performance regression on certain older NVIDIA GPUs due to coherent mapping being enabled needlessly. Turning it off caused no performance hit on newer NVIDIA video cards and sped up the older ones considerably. This seemed like a win-win situation, and everyone shook hands over a job well done.

Unfortunately, no one bothered to test AMD GPUs, because there is really no logical reason why Coherent Mapping would make them go faster, right? It turns out that disabling coherent mapping caused a 30% performance regression to AMD video cards now. Not wanting to add GPU specific code, Sonicadvance1 and degasus took some time to try to figure out a solution to make all video cards happy.

In the end, they determined that AMD's Pinned Memory extension for buffer streaming did not suffer the slowdown, and was even faster than Buffer Storage for AMD graphics cards. This simple change results in up to an 8% speed increase on AMD graphics cards compared to before the performance regression! Because NVIDIA cards do not even have the pinned memory extension, there is no plausible way this could make them slower again.

Unfortunately, the "compromise solution" introduced in 4.0-2010 did not completely fix the performance regression on ATI/AMD GPUs. It's still *way* slower (up to 30%) than build 4.0-1769 (the last build that had coherent mapping enabled before the 4.0-1778 "NVIDIA fix").
11-02-2014, 11:50 AM
#2
zephyrsurfer Offline
I'm a snake
***
Posts: 87
Threads: 6
Joined: Aug 2014
(11-02-2014, 11:34 AM)kirbypuff Wrote: It's still *way* slower (up to 30%) compared to build 4.0-1769 (the last build that had the AMD_Pinned_Memory OpenGL extension enabled before the 4.0-1778 "NVIDIA fix")

Hey.
What are you using to test this? Can I see some numbers? I have an AMD card as well, so I could help look into your issue.
11-02-2014, 12:06 PM (This post was last modified: 11-04-2014, 06:03 PM by kirbypuff.)
#3
kirbypuff Offline
The Original White Marshmallow
*****
Posts: 825
Threads: 37
Joined: Aug 2010
Tested on HD5xxx, HD6xxx, HD7xxx and the latest Rx 2xx series with both older drivers and the very latest drivers. It affects *all* ATI/AMD cards.
11-02-2014, 12:08 PM (This post was last modified: 11-02-2014, 12:09 PM by zephyrsurfer.)
#4
zephyrsurfer Offline
I'm a snake
***
Posts: 87
Threads: 6
Joined: Aug 2014
(11-02-2014, 12:06 PM)kirbypuff Wrote: Tested on HD5xxx, HD6xxx, HD7xxx and the latest Rx 2xx series with both older drivers and the very latest drivers. It affects *all* ATI/AMD cards.

Yeah but what games did you test on?

Edit: What game is the worst case?
11-02-2014, 12:17 PM
#5
MayImilae Online
Chronically Distracted
**********
Administrators
Posts: 4,572
Threads: 119
Joined: Mar 2011
kirbypuff Wrote:Unfortunately, the fix or "compromise solution" introduced in 4.0-2010 did not improve performance much or fix the performance regression on ATI/AMD GPUs. It's still *way* slower (up to 30%) compared to build 4.0-1769 (the last build that had the AMD_Pinned_Memory OpenGL extension enabled before the 4.0-1778 "NVIDIA fix")

You are making it out as though AMD_Pinned_Memory was canned, when the exact opposite is true. When Buffer_Storage was introduced, AMD_Pinned_Memory was ignored as the devs considered it more or less a hack to do what Buffer_Storage does. But with coherent mapping being a problem on Nvidia but needed for AMD, the devs realized they could just keep coherent mapping off and use AMD_Pinned_Memory on AMD chips and presto, both Nvidia and AMD get the best solutions, no "compromise solution" (which I might add was not in the article, despite your quotation implying it was) necessary. If you are using the latest dev build on an AMD card, you are using AMD_Pinned_Memory. It was tested by multiple AMD users.
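As a rough illustration of the selection logic described above, here is a minimal sketch. It is not Dolphin's actual code: `StreamPath`, `ChooseStreamPath`, and the `MapAndOrphan` fallback are hypothetical names made up for this example. Only the extension strings `GL_AMD_pinned_memory` and `GL_ARB_buffer_storage` are real OpenGL extension names.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical names for illustration; these are not Dolphin's identifiers.
enum class StreamPath { PinnedMemory, BufferStorage, MapAndOrphan };

// Pick a buffer streaming path from the extension strings the driver reports.
// The order follows the reasoning in the post: prefer AMD pinned memory where
// it is exposed (NVIDIA drivers do not expose it), otherwise use buffer
// storage with coherent mapping off, otherwise fall back to a generic path
// (the fallback is an assumption of this sketch).
StreamPath ChooseStreamPath(const std::set<std::string>& extensions)
{
  if (extensions.count("GL_AMD_pinned_memory"))
    return StreamPath::PinnedMemory;
  if (extensions.count("GL_ARB_buffer_storage"))
    return StreamPath::BufferStorage;
  return StreamPath::MapAndOrphan;
}
```

Because the choice keys off the extension list rather than the vendor name, it needs no GPU-specific branches, which matches the stated goal of avoiding vendor-specific code.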

It sounds like there is something wrong on your end. Please give your operating system, system specs, Dolphin version, and driver version.
11-02-2014, 12:20 PM (This post was last modified: 11-02-2014, 12:20 PM by JMC47.)
#6
JMC47 Offline
Content Producer
*******
Content Creators (Moderators)
Posts: 6,542
Threads: 29
Joined: Feb 2013
You do realize we re-enabled Pinned Memory over buffer storage for AMDs, right? Like seriously, you're just wrong, I tested it on my 5850 in polygon-heavy games.
11-02-2014, 12:24 PM (This post was last modified: 11-04-2014, 06:15 PM by kirbypuff.)
#7
kirbypuff Offline
The Original White Marshmallow
*****
Posts: 825
Threads: 37
Joined: Aug 2010
(11-02-2014, 11:50 AM)zephyrsurfer Wrote: Hey.
What are you using to test this? Can I see some numbers? I have an AMD card as well, so I could help look into your issue.

Try the usual SMG1_observatory_"benchmark" on a fully complete save (with 2x 121 stars) at 4xIR, EFB to Texture, leaving the main character standing for about 3 minutes (until he sleeps) and then recording the avg. framerate.

Just benchmark and compare the results of 4.0-1769, 4.0-1778, 4.0-2010 and the latest master build (4.0-3966).

NOTE: There is also another issue - inconsistent performance between runs (see my other thread in the dev forum), so you need to do at least 4 benchmark runs with each build (without closing Dolphin).
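To compare builds from those runs, something like the following sketch could be used to average the per-run framerates and express the difference against 4.0-1769 as a percentage. The function names `MeanFps` and `SlowdownPercent` are hypothetical helpers for this example, and no measured values are implied.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of the average-FPS figures recorded over several runs of one build.
double MeanFps(const std::vector<double>& runs)
{
  assert(!runs.empty());
  return std::accumulate(runs.begin(), runs.end(), 0.0) / runs.size();
}

// Percentage slowdown of a candidate build relative to a baseline build.
// A positive result means the candidate is slower than the baseline.
double SlowdownPercent(double baseline_fps, double candidate_fps)
{
  assert(baseline_fps > 0.0);
  return (baseline_fps - candidate_fps) / baseline_fps * 100.0;
}
```

Feed `MeanFps` the four (or more) run averages for each build, then pass the 4.0-1769 mean as `baseline_fps` and each later build's mean as `candidate_fps`. Averaging over several runs is there to dampen the run-to-run inconsistency mentioned in the note.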
11-02-2014, 12:31 PM
#8
JMC47 Offline
Content Producer
*******
Content Creators (Moderators)
Posts: 6,542
Threads: 29
Joined: Feb 2013
Yes, I know that there is inconsistent performance on that benchmark, which is why I was extremely careful when benchmarking it. Anyway, if you're having an issue, maybe something about the stupid coherent mapping is coming on. I don't think it's pinned memory, though. It could be that AMD's drivers have changed since we did that benchmark.
11-02-2014, 12:40 PM (This post was last modified: 11-04-2014, 06:08 PM by kirbypuff.)
#9
kirbypuff Offline
The Original White Marshmallow
*****
Posts: 825
Threads: 37
Joined: Aug 2010
(11-02-2014, 12:31 PM)JMC47 Wrote: Yes, I know that there is inconsistent performance on that benchmark, which is why I was extremely careful when benchmarking it. Anyway, if you're having an issue, maybe something about the stupid coherent mapping is coming on. I don't think it's pinned memory, though. It could be that AMD's drivers have changed since we did that benchmark.

It's definitely a coherent mapping issue: there has been a massive slowdown since the coherent mapping removal in 4.0-1778.

Tested with the latest stable drivers (14.4 WHQL, 14.9 WHQL) and the latest / best OpenGL driver for Radeon - the OpenCL2 Beta.
11-02-2014, 12:41 PM
#10
JMC47 Offline
Content Producer
*******
Content Creators (Moderators)
Posts: 6,542
Threads: 29
Joined: Feb 2013
Yeah, but we don't use coherent mapping any more on AMD; we use Pinned Memory now. Unless the newer drivers are totally backwards, there's no way that could be the issue at all.