Call for testers!
I'm in the middle of a rather large refactoring effort concerning the GC/Wii texture decoders and the code responsible for uploading decoded textures to the graphics card. I'm at a point now where I need testers to let me know how this affects speed in whichever games you have. I could test it all myself, but I don't have enough games or enough time to do it.
WHERE:
I'm using GitHub so I can keep track of my own changes independently of the SVN repo and also so that I don't break anything for anyone doing other work in the SVN trunk.
For those of you who can compile a Dolphin build on your own, please clone this git repository from GitHub: https://github.com/JamesDunne/Dolphin, build it, and test some of your games for me. It's a nearly identical copy of what's in SVN, so there should be no surprises if you're used to building off SVN.
WHAT:
This is an entirely experimental branch which focuses on optimizing the texture decode and upload procedure. The existing code in SVN is wasteful in that it:
- decodes the GC/Wii texture onto a temporary memory buffer in RGBA format
- locks the DX9 texture memory
- converts RGBA to BGRA on-the-fly while writing to the DX9 texture memory
- unlocks the DX9 texture memory which initiates an upload to the graphics card
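
To make the difference concrete, here is a rough C++ sketch of the two approaches. It is not the actual Dolphin code: the function names are made up for illustration, the source is treated as a plain linear RGB565 image (real GC/Wii textures are tiled), and `dst`/`pitchBytes` stand in for the pointer and row pitch you get back from locking a DX9 texture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb8 { std::uint8_t r, g, b; };

// Expand one RGB565 texel to 8 bits per channel by replicating the high bits.
static Rgb8 expand565(std::uint16_t t) {
    const unsigned r5 = (t >> 11) & 0x1F;
    const unsigned g6 = (t >> 5) & 0x3F;
    const unsigned b5 = t & 0x1F;
    return { static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
             static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
             static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2)) };
}

// Old path (hypothetical name): decode into a temporary RGBA buffer,
// then swizzle RGBA -> BGRA while copying into the locked texture memory.
void uploadViaTempBuffer(const std::uint16_t* src, int width, int height,
                         std::uint8_t* dst, std::size_t pitchBytes) {
    assert(pitchBytes >= static_cast<std::size_t>(width) * 4);
    const std::size_t count = static_cast<std::size_t>(width) * height;
    std::vector<std::uint8_t> temp(count * 4);
    for (std::size_t i = 0; i < count; ++i) {
        const Rgb8 c = expand565(src[i]);
        temp[i * 4 + 0] = c.r;
        temp[i * 4 + 1] = c.g;
        temp[i * 4 + 2] = c.b;
        temp[i * 4 + 3] = 255;
    }
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* in = temp.data() + static_cast<std::size_t>(y) * width * 4;
        std::uint8_t* out = dst + static_cast<std::size_t>(y) * pitchBytes;
        for (int x = 0; x < width; ++x) {
            out[x * 4 + 0] = in[x * 4 + 2];  // B
            out[x * 4 + 1] = in[x * 4 + 1];  // G
            out[x * 4 + 2] = in[x * 4 + 0];  // R
            out[x * 4 + 3] = in[x * 4 + 3];  // A
        }
    }
}

// Direct path (hypothetical name): write BGRA straight into the locked
// texture memory, with no temporary buffer and no second pass.
void decodeDirectToBgra(const std::uint16_t* src, int width, int height,
                        std::uint8_t* dst, std::size_t pitchBytes) {
    assert(pitchBytes >= static_cast<std::size_t>(width) * 4);
    for (int y = 0; y < height; ++y) {
        std::uint8_t* out = dst + static_cast<std::size_t>(y) * pitchBytes;
        for (int x = 0; x < width; ++x) {
            const Rgb8 c = expand565(src[static_cast<std::size_t>(y) * width + x]);
            out[x * 4 + 0] = c.b;
            out[x * 4 + 1] = c.g;
            out[x * 4 + 2] = c.r;
            out[x * 4 + 3] = 255;
        }
    }
}
```

In real DX9 code, `dst` and `pitchBytes` would come from the `pBits` and `Pitch` members of the `D3DLOCKED_RECT` filled in by `IDirect3DTexture9::LockRect`. Both functions produce the same bytes; the second one just touches each texel once.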

For OpenGL I'm going to look into using PBOs (pixel buffer objects) so that texture uploads can go through DMA transfers instead of CPU-bound copies. As of now in SVN this is done with a plain glTexSubImage2D call, which does use CPU-bound copies and forces the emulator to wait until the texture is uploaded. That wait is unnecessary; the emulator could be doing more useful things while the texture uploads to the card. Switching to PBOs should give a significant performance increase for the OGL plugin and should hopefully bring Linux up to speed with Windows and DX9.
DX11 is last on my priority list since it should be a straightforward conversion of the DX9 code, I hope.
I've disabled the SSE2 and SSSE3 optimizations that Xsacha and I worked so hard on in the current SVN over the last few weeks. They need significant work to be compatible with decoding directly to BGRA instead of RGBA, which is what they are coded to do now. The bulk of the work will be in fixing those optimized decoders to safely write texels near the texture boundaries. Since we're not using a temporary buffer anymore, we don't have the luxury of nice 8-texel aligned textures. The DX/OGL texture can be neither padded nor cropped quickly, so we must modify the decoders to accommodate this change.
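For anyone wondering what "safely write texels near the texture boundaries" means in practice, here is a minimal scalar sketch in C++. It is an illustration only, not the branch's decoder: the function name is hypothetical, and it assumes an 8-bit intensity source stored as 8x4-texel tiles (32 bytes each, row-major inside the tile, tiles laid out left to right and top to bottom, padded to whole tiles). Alpha is simply set to 255 here. The point is that the last tile in each row and column gets clipped so nothing is written past the real width and height of the destination.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode a tiled 8-bit intensity image directly into a BGRA destination,
// clipping partial tiles at the right and bottom edges.
void decodeI8TiledClipped(const std::uint8_t* src, int width, int height,
                          std::uint8_t* dst, std::size_t pitchBytes) {
    const int kTileW = 8;
    const int kTileH = 4;
    assert(pitchBytes >= static_cast<std::size_t>(width) * 4);

    // The source is padded to whole tiles; the destination is not.
    const int tilesX = (width + kTileW - 1) / kTileW;
    const int tilesY = (height + kTileH - 1) / kTileH;

    for (int ty = 0; ty < tilesY; ++ty) {
        for (int tx = 0; tx < tilesX; ++tx) {
            const std::uint8_t* tile =
                src + (static_cast<std::size_t>(ty) * tilesX + tx) * (kTileW * kTileH);

            // Number of rows and columns of this tile that are inside the texture.
            const int rows = std::min(kTileH, height - ty * kTileH);
            const int cols = std::min(kTileW, width - tx * kTileW);

            for (int y = 0; y < rows; ++y) {
                std::uint8_t* out = dst
                    + static_cast<std::size_t>(ty * kTileH + y) * pitchBytes
                    + static_cast<std::size_t>(tx * kTileW) * 4;
                for (int x = 0; x < cols; ++x) {
                    const std::uint8_t i = tile[y * kTileW + x];
                    out[x * 4 + 0] = i;    // B
                    out[x * 4 + 1] = i;    // G
                    out[x * 4 + 2] = i;    // R
                    out[x * 4 + 3] = 255;  // A (assumption for this sketch)
                }
            }
        }
    }
}
```

An SSE2/SSSE3 version has the same problem in a harder form, because one vector store covers several texels at once: interior tiles can keep the wide stores, while edge tiles need a narrower or scalar fallback like the inner loop above.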
Key testing configuration:
- Use the DX9 plugin only. DX11 and OGL *will* break.
- Disable OpenCL.