Dolphin, the GameCube and Wii emulator - Forums

Full Version: shortcut to skip EFB Access from CPU
Pages: 1 2
Could any of you guys do this, please? Mario Galaxy is very heavy (even in HLE), and enabling Skip EFB relieves load on the CPU, stabilizing the frame rate at 60 FPS...

What's the point? At many moments in the game the frame rate drops to 50/45, and when you enable Skip EFB it goes back to 60...

Could you guys create a shortcut (with controller support, please) to enable/disable this function?

(I know this option disables the pointer function.)

Thanks!

If this is in the wrong area, please move it to the correct one!
No one? Is it impossible? Wrong area?
Are you using the OpenGL backend? You should, it makes EFB accesses a lot faster.
(03-31-2014, 04:21 AM) delroth wrote: Are you using the OpenGL backend? You should, it makes EFB accesses a lot faster.

Thanks, I'm using DirectX... I'll try this!
@delroth: nope. SMG doesn't access the EFB very often, just a few times per frame.
The performance hit of an EFB access comes from stalling on the GPU and the GPU thread, and that is almost the same on both backends ;)

sagaopc: it is possible, but I don't know if anyone will care :/
Hi,

I recently started playing with a Dolphin code drop for fun, running profilers and boosting performance where possible. I am a graphics/low-level programmer in the video game industry, and will focus first on making the D3D11 backend faster than the OGL one :)

I do not yet have anything to push to git, but one of the changes I am working on already involves a deferred EFB access mode "hack". Usually, when a graphics engine reads back memory from the GPU, it does so every frame, and the result does not really need a strong link to the frame in which it was requested. User code usually already deals with frame latency to prevent CPU/GPU syncs anyway (at least on X360, PS3 and PC).

The idea is to remove the sync completely. On the render thread, we execute requests received as fake FIFO commands and put the results in a cache. On the CPU thread, we push the fake command to the FIFO and try to match it against a request completed in a previous frame, so that a value can be returned directly from the cache.
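To make the idea above a bit more concrete, here is a minimal single-threaded sketch of it. Everything in it is an assumption made for illustration: `DeferredEfbPeek`, `PeekRequest`, `Peek` and `ProcessFifo` are hypothetical names, not Dolphin code, and a real version would need locking (or a lock-free queue) between the CPU and render threads.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <map>
#include <optional>
#include <utility>

// Hypothetical request record standing in for a "fake FIFO command".
struct PeekRequest {
    int x;
    int y;
};

// Sketch of a deferred EFB peek: the CPU side never waits for the GPU.
// It queues a request and returns whatever an earlier frame produced.
class DeferredEfbPeek {
public:
    // Stand-in for the actual (slow, synchronous) pixel readback.
    using ReadPixel = std::function<uint32_t(int, int)>;

    explicit DeferredEfbPeek(ReadPixel read) : read_(std::move(read)) {
        assert(static_cast<bool>(read_));
    }

    // "CPU thread": push the fake command, then answer from the cache.
    // Returns nullopt the first time a pixel is requested.
    std::optional<uint32_t> Peek(int x, int y) {
        fifo_.push_back(PeekRequest{x, y});
        auto it = cache_.find(std::make_pair(x, y));
        if (it == cache_.end()) return std::nullopt;
        return it->second;
    }

    // "Render thread": execute queued requests and store the results.
    void ProcessFifo() {
        while (!fifo_.empty()) {
            const PeekRequest r = fifo_.front();
            fifo_.pop_front();
            cache_[std::make_pair(r.x, r.y)] = read_(r.x, r.y);
        }
    }

private:
    ReadPixel read_;
    std::deque<PeekRequest> fifo_;
    std::map<std::pair<int, int>, uint32_t> cache_;
};
```

Note that a value returned by `Peek` is always at least one `ProcessFifo` call old, and the very first request for a pixel has no value at all. That latency is the trade-off of the hack.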

I am working on this for Zelda: Skyward Sword, as a profiling session showed a lot of time wasted on syncs to allow PEEKs into the depth buffer. If you know of other games that use EFB access and show visible glitches when it is disabled, let me know; I will look at the access pattern to see whether the hack I am working on is also viable for them.
Check out F-Zero GX on the sand ocean tracks; Monster Hunter Tri (pretty much anywhere in game, but character creation works) etc. If it's a hack that doesn't get in the way of working code, and is well written, and works for most, if not all games, you at least could have a puncher's chance in having it merged (maybe as an option), assuming this isn't some huge hack that changes everything.

Then again, I'm not a developer, so I can't really tell you how viable it is to get support, but it at least sounds interesting to me as a user.
@galop1n: Hey! I'm not sure how I feel about your "deferred EFB access" idea, it sounds pretty much like something which has bad chances of being integrated in the main Dolphin tree (it'd just be another just-sort-of-useful option in the graphics dialog). You're free to try it out anyways of course, just making sure there's no bad surprise if we indeed happen to turn down an eventual PR of yours.
Fyi, if your goal is to speed up EFB accesses in D3D11, have you looked into OpenGL's EFB access cache, yet? D3D would likely benefit a lot from such a cache, and moving the cache from OpenGL-specific code to VideoCommon has been on our TODO list for a while. There just has never been anyone with spare time and motivation to work on it though.
The EFB cache in the OGL backend is useful to avoid multiplying GL API calls to query individual pixels, which is unlikely to be efficient compared to reading back a full rectangle. But it still needs to flush the FIFO and, worse, it flushes the driver command buffer (which is usually rendering a frame behind).
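As a rough illustration of the "read back a full rectangle instead of individual pixels" point, here is a small sketch of a tile cache. It is not the OGL backend's actual code: `TiledEfbCache`, the 64-pixel tile size and the `ReadRect` callback are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Sketch of a tile-based EFB cache: one rectangle readback serves many
// per-pixel peeks, until something draws to the EFB and invalidates it.
class TiledEfbCache {
public:
    static constexpr int kTile = 64;  // assumed tile edge length in pixels

    // Stand-in for one readback call: returns kTile * kTile pixels,
    // row-major, for the tile whose top-left corner is (x0, y0).
    using ReadRect = std::function<std::vector<uint32_t>(int x0, int y0)>;

    explicit TiledEfbCache(ReadRect read_rect)
        : read_rect_(std::move(read_rect)) {}

    uint32_t Peek(int x, int y) {
        assert(x >= 0 && y >= 0);
        const std::pair<int, int> key(x / kTile, y / kTile);
        auto it = tiles_.find(key);
        if (it == tiles_.end()) {
            // Cache miss: pay for one rectangle readback.
            ++readbacks_;
            it = tiles_.emplace(key, read_rect_(key.first * kTile,
                                                key.second * kTile)).first;
        }
        return it->second[(y % kTile) * kTile + (x % kTile)];
    }

    // Must be called whenever the EFB contents may have changed.
    void Invalidate() { tiles_.clear(); }

    int readbacks() const { return readbacks_; }

private:
    ReadRect read_rect_;
    std::map<std::pair<int, int>, std::vector<uint32_t>> tiles_;
    int readbacks_ = 0;
};
```

The sketch only reduces the number of readback calls. Each miss still has to wait for the GPU, which is the flush cost described above.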

Instead, if the request results are pushed into a buffer on the GPU side (for example with a small compute shader writing to an append buffer) and we only Map that buffer the next frame, we also keep the driver from flushing itself, since we are likely to have already passed the fence protecting the buffer. Of course, that is something we are more likely to do directly in the graphics engine of a modern game, but depending on why a game reads back a rendered texture, the hack can be virtually free compared to not doing EFB access at all, possibly without breaking the game feature that uses it.

Of course, this is only my first contact with the code base, and I am still missing the big picture. Anyway, I gained 4 ms on the press start screen of "OnePiece:UA" with just some basic cleanup in Renderer::ApplyState :)
Yeah, you're basically right; it's just that there's no easy way to implement what you're proposing without sacrificing accuracy. The value returned by AccessEFB might be used as early as the next CPU instruction, so if you aren't making sure the data is ready by then, things will go wrong. Of course, it might work for one game or another, but ultimately we don't want such assumptions, since sooner or later they turn out to be a PITA.