11-19-2015, 05:08 AM
11-19-2015, 07:33 AM
(11-19-2015, 05:08 AM)DJBarry004 Wrote: [ -> ]Maybe it's your GPU driver waiting for an update? (Make sure you have the latest one installed).
I'm running the latest driver, 358.91.
11-19-2015, 12:59 PM
I've tried to fix the bug but you and one other person are the only ones with the hang so it's hard. Nobody else can reproduce the hang in my utility.
https://github.com/aserna3/DolphinBisectTool/releases/tag/v1.0.2
You can try that release, if it works, go ahead and bisect the build.
Alternatively, you can just do a bisection yourself. Take the two builds (Count 4.0.2 as zero), add them together, divide by 2. Find the closest build to that number and test it. If the bug doesn't exist in that build, take the number you have, add one to it, and make it your new working build number. If it does exist in that build, take that number, subtract one from it, and make it your new broken build number. Repeat this process until you find the build that breaks.
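For anyone who'd rather see the manual method written out, here's a rough Python sketch of the same idea. It's only an illustration, not how DolphinBisectTool is implemented: the build list and the is_broken check are made up, and in practice is_broken means "run that build and see whether the bug shows up". This variant keeps the tested build itself as the new working/broken bound instead of adding or subtracting one, and it searches a list of builds that actually exist, since not every build number has a download.

```python
from typing import Callable, Sequence


def bisect_builds(builds: Sequence[int], is_broken: Callable[[int], bool]) -> int:
    """Return the first broken build number.

    Assumes `builds` is sorted ascending, builds[0] is known to work,
    and builds[-1] is known to be broken.
    """
    good = 0                # index of the last known working build
    bad = len(builds) - 1   # index of the first known broken build
    while bad - good > 1:
        mid = (good + bad) // 2      # halfway between the two bounds
        if is_broken(builds[mid]):
            bad = mid                # bug present: this is the new broken bound
        else:
            good = mid               # bug absent: this is the new working bound
    return builds[bad]


# Hypothetical list of available builds (0 stands in for 4.0.2).
available = [0, 1500, 1700, 1769, 1776, 1778, 2010, 8187]

# Stand-in for "launch the build and test it"; here the bug is
# pretended to start at build 1778.
first_bad = bisect_builds(available, lambda build: build >= 1778)
print(first_bad)  # prints 1778
```

Each round halves the range, so even a few thousand builds only take around a dozen test runs.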
YET ANOTHER alternative: install Visual Studio 2015 and run my utility. That seems to be the only thing you two don't have. Granted, I tested my utility in a clean VM with only .NET 4.0 installed and it ran fine.
See, this is a lot of annoying work which is why I want to get my utility working for you >.>
11-19-2015, 03:54 PM
(11-19-2015, 12:59 PM)helios747 Wrote: [ -> ]I've tried to fix the bug but you and one other person are the only ones with the hang so it's hard. Nobody else can reproduce the hang in my utility.
https://github.com/aserna3/DolphinBisectTool/releases/tag/v1.0.2
You can try that release, if it works, go ahead and bisect the build.
Alternatively, you can just do a bisection yourself. Take the two builds (Count 4.0.2 as zero), add them together, divide by 2. Find the closest build to that number and test it. If the bug doesn't exist in that build, take the number you have, add one to it, and make it your new working build number. If it does exist in that build, take that number, subtract one from it, and make it your new broken build number. Repeat this process until you find the build that breaks.
YET ANOTHER alternative: install Visual Studio 2015 and run my utility. That seems to be the only thing you two don't have. Granted, I tested my utility in a clean VM with only .NET 4.0 installed and it ran fine.
See, this is a lot of annoying work which is why I want to get my utility working for you >.>
Alright, I just followed your method of elimination (it reminds me of a search algorithm I learned in software development in college).
So the last version where OpenGL works great is 4.0-1776, and the next version (4.0-1778) is where OpenGL becomes extremely slow. I believe it may have something to do with the OGL-StreamBuffer.
11-19-2015, 04:23 PM
Thanks! Looking into this now. What game did you test by the way? It'll be helpful.
Also, just to be comprehensive can you post screenshots of your settings in the broken build?
Specifically, Config > General, Graphics > General, Graphics > Enhancements, and Graphics > Hacks
11-20-2015, 03:53 AM
(11-19-2015, 04:23 PM)helios747 Wrote: [ -> ]Thanks! Looking into this now. What game did you test by the way? It'll be helpful.
Also, just to be comprehensive can you post screenshots of your settings in the broken build?
Specifically, Config > General, Graphics > General, Graphics > Enhancements, and Graphics > Hacks
I mainly used Castlevania Judgment to test.
As soon as I was at the character select screen, that's when the framerate would drop. Possibly because that's where it starts rendering 3D models instead of simple 2D objects.
If I ever encountered slow FPS, I tried switching the FPS limit between Auto/Audio/30 to see which one had the best results.
11-20-2015, 06:49 AM
As far as I can tell, your settings are pretty good. However, 99% of the time you don't need to set the framelimit to anything other than auto, and you shouldn't. Dolphin is very good about handling that by itself.
Additionally, not the cause of your problem, but you can leave fullscreen res to auto as well.
One last thing you can try is going into the NVIDIA Control Panel, going to 3D settings, creating a per-program profile for Dolphin, and specifically setting the GPU's performance setting to prefer maximum performance. This is sometimes required because Dolphin's GPU workload is very low at stock settings, so the driver's heuristics are bad at detecting what performance mode we need. This happens a lot in Super Mario Galaxy.
Anyways, this is more than enough to file a bug report if you want to at https://bugs.dolphin-emu.org/projects/emulator/issues
Thanks for nailing down this issue! Making an issue there will give it more exposure to the devs.
11-20-2015, 08:07 AM
Finally, someone is able to reproduce the OpenGL performance regression on AMD/ATI GPUs that was posted a year ago, but this time using an NVIDIA GPU.
NOTE #1: In that thread I benchmarked 4.0-1769 (exact same performance as 4.0-1776).
Disabling coherent mapping causes a performance drop on nearly all modern GPUs regardless of the brand (AMD or NVIDIA) or the use of vendor-specific OpenGL extensions (e.g. GL_AMD_Pinned_Memory).
Here's a quick NSMB benchmark @ 4xIR, EFB2Tex with the old HD6850 + the latest drivers (Cat. 15.11.1):
NOTE #2: Can't do a 6xIR bench because the old builds don't support anything higher than 4x.
w1-1
====
4.0-1776 OGL = 116 FPS
4.0-1778 OGL = 101 FPS
4.0-2010 OGL = 116 FPS
--------------------------
4.0-1778 D3D = 140 FPS
--------------------------
4.0-8187 OGL = 116 FPS
4.0-8187 D3D = 138 FPS
w2-overview
=========
4.0-1776 OGL = 88 FPS
4.0-1778 OGL = 74 FPS
4.0-2010 OGL = 85 FPS
-------------------------
4.0-1778 D3D = 100 FPS
--------------------------
4.0-8187 OGL = 97 FPS
4.0-8187 D3D = 126 FPS
NOTE #3: 4.0-2010 is the same as 4.0-1778, but with GL_AMD_Pinned_Memory instead of GL_ARB_Buffer_Storage.
Pinned Memory nearly closes the gap, but not quite. Coherent mapping is still faster. In vertex-heavy scenes, the difference is much more pronounced (in other titles, there's a HUGE difference between coherent mapping and non-coherent mapping with pinned memory).
Even worse, in some cases Pinned Memory can be slower than Buffer Storage:
Cake Intro
=======
4.0-1776 OGL = 75 FPS
4.0-1778 OGL = 74 FPS
4.0-2010 OGL = 72 FPS
OTOH, Coherent memory doesn't have any of these drawbacks / caveats.
NOTE #4: The latest dev. build is so much faster than the other builds because of a bunch of other optimizations (SSE-optimized vertex loaders, texture cache rewrite, more efficient JIT, etc.)
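To put the 4.0-1776 vs 4.0-1778 OGL numbers above in relative terms, here's a small Python sketch that works out the percentage drop per scene. The FPS values are copied from the benchmarks in this post; the percent_drop helper and the dictionary layout are just for illustration.

```python
def percent_drop(before: float, after: float) -> float:
    """Relative FPS loss going from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# (4.0-1776 OGL FPS, 4.0-1778 OGL FPS) per scene, from the numbers above.
results = {
    "w1-1": (116, 101),
    "w2-overview": (88, 74),
    "Cake Intro": (75, 74),
}

for scene, (fps_1776, fps_1778) in results.items():
    print(f"{scene}: {percent_drop(fps_1776, fps_1778):.1f}% slower")
# w1-1: 12.9% slower
# w2-overview: 15.9% slower
# Cake Intro: 1.3% slower
```

So on this HD6850 the regression is roughly 13-16% in the two gameplay scenes and close to nothing in the intro.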
11-21-2015, 09:59 PM
If you're wondering why devs still can't reproduce this issue - it's because they're testing on a PC with an Intel CPU.
The OP also has an AMD CPU and what's even more interesting, it's of the same K10h architecture.
These CPUs are very sensitive when it comes to memory transfer performance between the CPU and GPU (they're significantly slower in this area compared to Intel's CPUs).
Coherent mapping removes *a lot* of that overhead and makes these CPUs much more efficient with the OpenGL backend.
11-22-2015, 03:34 AM
It's been reproducible for a while now, just not to the extent of the problems the OP is having.