Testers needed for refactoring of texture decode/upload code
01-14-2011, 04:00 PM
#1
JamesDunne Offline
Junior Member
**
Posts: 21
Threads: 1
Joined: Jan 2011
Call for testers!
I'm in the middle of a rather large refactoring effort concerning the GC/Wii texture decoders and the code responsible for uploading decoded textures to the graphics card. I'm at a point now where I need testers to let me know how this affects speed in whatever games they have. I could test it all myself, but I don't have enough games or enough time to do it.

WHERE:
I'm using github so I can keep track of my own changes independent of the SVN repo and also so that I don't break anything for anyone doing other work in the SVN trunk.

For those of you who can compile a Dolphin build on your own, please clone this git repository from github: https://github.com/JamesDunne/Dolphin , build it and test some of your games for me. It's a nearly identical copy of what's in SVN so there should be no surprises if you're used to building off SVN.

WHAT:
This is an entirely experimental branch which focuses on optimizing the texture decode and upload procedure. The existing code in SVN is wasteful in that it:
  1. decodes the GC/Wii texture onto a temporary memory buffer in RGBA format
  2. locks the DX9 texture memory
  3. converts RGBA to BGRA on-the-fly while writing to the DX9 texture memory
  4. unlocks the DX9 texture memory which initiates an upload to the graphics card
This new branch tests the idea of removing the temporary memory buffer and the unnecessary RGBA -> BGRA format conversion. We now decode into BGRA format directly into the DX9 texture memory with no intermediate copies needed. I'll have to check the DX9 documentation, but I hope that DMA transfers are being used to upload to the graphics card in this case. If not, then maybe that's a DX11 thing to investigate, or maybe it's a different call in DX9. I'm not experienced with DirectX so I don't know yet, but I'll find out :)
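
A minimal sketch of the difference between the two paths, in plain C++ with no Direct3D calls. It assumes a linear array of host-order RGB565 texels as input (real GC/Wii textures are tiled and big-endian, which is left out here), and `dst`/`dstPitch` stand in for the pointer and row pitch that a locked DX9 texture would provide. The function names are made up for illustration and are not the ones used in the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb8 { uint8_t r, g, b; };

// Expand a 16-bit RGB565 texel to 8 bits per channel.
inline Rgb8 ExpandRGB565(uint16_t texel)
{
    const unsigned r5 = (texel >> 11) & 0x1F;
    const unsigned g6 = (texel >> 5) & 0x3F;
    const unsigned b5 = texel & 0x1F;
    // Replicate the high bits into the low bits so full scale maps to 0xFF.
    return { static_cast<uint8_t>((r5 << 3) | (r5 >> 2)),
             static_cast<uint8_t>((g6 << 2) | (g6 >> 4)),
             static_cast<uint8_t>((b5 << 3) | (b5 >> 2)) };
}

// Old path (steps 1 and 3 above): decode to a temporary RGBA buffer,
// then swizzle to BGRA while copying into the locked texture memory.
void UploadViaTempBuffer(const uint16_t* src, int width, int height,
                         uint8_t* dst, int dstPitch)
{
    assert(dstPitch >= width * 4);
    std::vector<uint8_t> temp(static_cast<size_t>(width) * height * 4);
    for (int i = 0; i < width * height; ++i)
    {
        const Rgb8 c = ExpandRGB565(src[i]);
        temp[i * 4 + 0] = c.r;
        temp[i * 4 + 1] = c.g;
        temp[i * 4 + 2] = c.b;
        temp[i * 4 + 3] = 0xFF;
    }
    for (int y = 0; y < height; ++y)
    {
        uint8_t* row = dst + static_cast<size_t>(y) * dstPitch;
        for (int x = 0; x < width; ++x)
        {
            const uint8_t* t = &temp[(static_cast<size_t>(y) * width + x) * 4];
            row[x * 4 + 0] = t[2];  // B
            row[x * 4 + 1] = t[1];  // G
            row[x * 4 + 2] = t[0];  // R
            row[x * 4 + 3] = t[3];  // A
        }
    }
}

// New path: decode straight into the destination in BGRA order,
// honoring the row pitch, with no intermediate buffer.
void DecodeDirectBGRA(const uint16_t* src, int width, int height,
                      uint8_t* dst, int dstPitch)
{
    assert(dstPitch >= width * 4);
    for (int y = 0; y < height; ++y)
    {
        uint8_t* row = dst + static_cast<size_t>(y) * dstPitch;
        for (int x = 0; x < width; ++x)
        {
            const Rgb8 c = ExpandRGB565(src[static_cast<size_t>(y) * width + x]);
            row[x * 4 + 0] = c.b;
            row[x * 4 + 1] = c.g;
            row[x * 4 + 2] = c.r;
            row[x * 4 + 3] = 0xFF;
        }
    }
}
```

Both functions leave identical bytes in the destination; the second just skips the temporary buffer and the extra pass over it.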

For OpenGL I'm going to look into using PBOs (pixel buffer objects) to make use of DMA transfers instead of CPU-bound copies to upload textures to the card. As of now in SVN this is done with a normal glTexSubImage2D call, which uses CPU-bound copies and needlessly forces the emulator to wait until the texture is uploaded. The emulator could be doing more useful things while waiting for some stupid texture to upload to the card. Switching to PBOs should give a significant performance increase for the OGL plugin and hopefully bring Linux up to speed with Windows and DX9.

DX11 is last on my priority list since it should be a straightforward conversion of the DX9 code, I hope.

I've disabled the SSE2 and SSSE3 optimizations that Xsacha and I worked so hard on in the current SVN over the last few weeks. These need significant work in order to be compatible with decoding directly to BGRA instead of RGBA as they are now coded to do. The bulk of the work will be in fixing those optimized decoders to safely write texels near the texture boundaries. Since we're not using a temporary buffer anymore, we don't have the luxury of nice 8-texel aligned textures. The DX/OGL texture cannot be padded or cropped quickly, so we must modify the decoders to accommodate this change.
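
To illustrate the boundary problem, here is a rough sketch of a scalar I8 decoder that clips at the texture edges. It assumes I8 source data laid out in 8x4-texel tiles padded out to whole tiles, writes the intensity to all four channels (a choice made for this sketch), and uses a hypothetical function name. It is not the code in the branch; the SSE2/SSSE3 versions would need equivalent edge handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int kTileW = 8;  // assumed I8 tile width in texels
constexpr int kTileH = 4;  // assumed I8 tile height in texels

// Decode tiled 8-bit intensity texels directly into a width x height
// destination with the given row pitch. The source is padded to whole
// tiles, but the destination is not, so each tile is clipped to the
// texture bounds before any texel is written.
void DecodeI8ToBGRAClipped(const uint8_t* src, int width, int height,
                           uint8_t* dst, int dstPitch)
{
    assert(width > 0 && height > 0 && dstPitch >= width * 4);
    const int tilesX = (width + kTileW - 1) / kTileW;

    for (int ty = 0; ty < height; ty += kTileH)
    {
        for (int tx = 0; tx < width; tx += kTileW)
        {
            const size_t tileIndex =
                static_cast<size_t>(ty / kTileH) * tilesX + tx / kTileW;
            const uint8_t* tile = src + tileIndex * (kTileW * kTileH);

            // Only the part of the tile that lies inside the texture.
            const int rows = std::min(kTileH, height - ty);
            const int cols = std::min(kTileW, width - tx);

            for (int y = 0; y < rows; ++y)
            {
                uint8_t* out = dst + static_cast<size_t>(ty + y) * dstPitch
                                   + static_cast<size_t>(tx) * 4;
                for (int x = 0; x < cols; ++x)
                {
                    const uint8_t i = tile[y * kTileW + x];
                    out[x * 4 + 0] = i;  // B
                    out[x * 4 + 1] = i;  // G
                    out[x * 4 + 2] = i;  // R
                    out[x * 4 + 3] = i;  // A
                }
            }
        }
    }
}
```

With the old temporary buffer the decoder could always write whole tiles and let the copy step crop them; here the `rows`/`cols` clamp is what keeps a 10x5 texture from being written as 16x8.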

Key testing configuration:
  • Use DX9 plug-in only. DX11 and OGL *will* break.
  • Disable OpenCL.
If someone could post a binary distributable of commit 4faf3aa, that'd be fantastic. I don't have time to do it right now, but I will consider it later if I don't get any significant feedback within a day or two.
01-14-2011, 04:08 PM
#2
emu-muncher Offline
Member
***
Posts: 121
Threads: 1
Joined: Jun 2010
Will test if someone can upload a build.
Intel core i5 750 @ 3.2 GHz / AMD 5850 / 4GB RAM @ 1600 MHz
Windows 7 x64
01-14-2011, 04:10 PM
#3
Squall Leonhart Offline
Friend of local jackass
*******
Posts: 1,177
Threads: 27
Joined: Apr 2009
Quote: DX11 is last on my priority list since it should be a straightforward conversion of the DX9 code, I hope.

Please don't approach DX11 like this. You will kill most of the gains that it provides if you think it's as simple as converting the code. Many half-wit DX11 developers are doing conversions which maintain DX9 contexts; this can incur up to a 40% penalty in performance compared to native DX11-style contexts and creation.
VBA-M
01-14-2011, 04:40 PM
#4
JamesDunne Offline
Junior Member
**
Posts: 21
Threads: 1
Joined: Jan 2011
(01-14-2011, 04:10 PM)Squall Leonhart Wrote: Please don't approach DX11 like this. You will kill most of the gains that it provides if you think it's as simple as converting the code. Many half-wit DX11 developers are doing conversions which maintain DX9 contexts; this can incur up to a 40% penalty in performance compared to native DX11-style contexts and creation.
Yeah, I'll check the documentation on DX11 to see how to best accomplish the task. I was assuming it would look very similar to DX9. Guess it won't be.
01-14-2011, 04:42 PM (This post was last modified: 01-14-2011, 04:43 PM by NaturalViolence.)
#5
NaturalViolence Offline
It's not that I hate people, I just hate stupid people
*******
Posts: 9,013
Threads: 24
Joined: Oct 2009
+1 for a Dolphin dev taking code branches seriously. We need more people like this testing experimental code BEFORE committing it to SVN (fifo anyone?). I'll join in when I have time on the weekends.
"Normally if given a choice between doing something and nothing, I’d choose to do nothing. But I would do something if it helps someone else do nothing. I’d work all night if it meant nothing got done."  
-Ron Swanson

"I shall be a good politician, even if it kills me. Or if it kills anyone else for that matter. "
-Mark Antony
01-14-2011, 06:20 PM
#6
Link_to_the_past Offline
Link on steroids really
*******
Posts: 1,767
Threads: 17
Joined: Feb 2010
After testing some GPU-heavy games (Mario Kart Wii, Super Mario Galaxy) at a high EFB scale with the Direct3D9 plugin, I didn't notice any speedup; instead I noticed texture errors.
Below is an x64 build for anyone interested:
Test x64 build
01-14-2011, 07:52 PM
#7
emu-muncher Offline
Member
***
Posts: 121
Threads: 1
Joined: Jun 2010
Same results here. No real difference in speed but some bugs in some games.

example:
[Image: smgny.th.jpg]
01-14-2011, 07:59 PM
#8
Squall Leonhart Offline
Friend of local jackass
*******
Posts: 1,177
Threads: 27
Joined: Apr 2009
Those are most definitely EFB regions not being rendered correctly.

Zelda TP does the blob thing as well if you mess about with the safe cache settings on trunk.
01-14-2011, 10:52 PM (This post was last modified: 01-14-2011, 10:56 PM by KarstenS.)
#9
KarstenS Offline
Member
***
Posts: 125
Threads: 7
Joined: Jan 2010
(01-14-2011, 04:00 PM)JamesDunne Wrote: For those of you who can compile a Dolphin build on your own, please clone this git repository from github: https://github.com/JamesDunne/Dolphin , build it and test some of your games for me. It's a nearly identical copy of what's in SVN so there should be no surprises if you're used to building off SVN.

I'll also compile and test. I hope you'll post here when the code gets updated.

Question: Why not open a branch directly in the project at Google Code?


(01-14-2011, 04:00 PM)JamesDunne Wrote: We now decode into BGRA format directly into the DX9 texture memory with no intermediate copies needed.

I hope this will work well. That could also open the door to make use of more OpenCL stuff.
01-15-2011, 02:11 AM (This post was last modified: 01-15-2011, 02:17 AM by JamesDunne.)
#10
JamesDunne Offline
Junior Member
**
Posts: 21
Threads: 1
Joined: Jan 2011
(01-14-2011, 10:52 PM)KarstenS Wrote: I'll also compile and test. I hope you'll post here when the code gets updated.

Question: Why not open a branch directly in the project at Google Code?
Because GIT >>>> SVN. Git lets me commit locally and work offline and rearrange things and revert previous work, etc. SVN requires me to be connected if I want to commit changes. Also, much less of an issue, but nobody else seems to have any branches going on in the SVN repo so I don't wanna be "that guy" that has his own.
(01-14-2011, 10:52 PM)KarstenS Wrote: I hope this will work well. That could also open the door to make use of more OpenCL stuff.
How are you thinking it would help OpenCL? Seems to me we could just decode direct to texture memory with OpenCL in BGRA format and remove the unnecessary back-and-forth trips from GPU to CPU. Although I honestly don't have any OpenCL experience to talk about it in any amount of detail.
(01-14-2011, 07:52 PM)emu-muncher Wrote: Same results here. No real difference in speed but some bugs in some games.

example:
[Image: smgny.th.jpg]
Interesting screenshot. I didn't touch any EFB work, just the texture decoders and upload. They should all be working fine. I did a play-through of MKWii just to be sure, since I believe it makes use of all the texture formats.

A lack of speed improvement is actually a good thing in this case. I removed all the SSE2 and SSSE3 optimizations in favor of a much less redundant texture upload scenario. So, in theory, if it produces equivalent performance w/o the optimizations, that might mean it'll be a net gain once I go back and patch up the SSE2 and SSSE3 stuff to work this new way. Who knows, it's a draw at the moment. I would figure anything else I do from here should be a win. :)
(01-14-2011, 04:42 PM)NaturalViolence Wrote: +1 for a Dolphin dev taking code branches seriously. We need more people like this testing experimental code BEFORE committing it to SVN (fifo anyone?). I'll join in when I have time on the weekends.
I'm pretty sure the other devs take their work seriously :). I just knew this would be a large breaking change and I didn't want to upset anyone while I was working on it. Seemed like the logical thing to do.

Also, I'm sorta the new guy so I'd like to hold off any major fluff-ups until at least my second month "on the job." lol