Dolphin, the GameCube and Wii emulator - Forums

Full Version: VideoCommon > Code with SSSE3/SSE4.1 intrinsic functions
Pages: 1 2 3
(02-23-2010, 01:40 AM)ector Wrote: [ -> ]Nice stuff. If bSSSE3 and SSE4 don't work, they should be fixed - could just set them using your code. If you do that, I'll give you commit access so you can submit it, just PM me your gmail/google code account name and I'll add you.

I sent a private message to you.
After that, I figured out why bSSSE3 and bSSE4 do not work: cpu_info.Detect() is called only in Dolphin.exe's main() and not in Video_**.dll's main.cpp, so the flags had not been set when I tried to use them. I will commit another patch that fixes this once I have commit access.
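To illustrate the symptom, here is a minimal sketch, assuming each module (the executable and each plugin DLL) carries its own copy of the CPU flags. Only the flag names bSSSE3 and bSSE4 come from this thread; the struct, the DetectFromLeaf1Ecx function and the HasSSSE3 helper are hypothetical, and bSSE4 is assumed to mean SSE4.1. The bit positions are the standard CPUID leaf 1 ECX feature bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a per-module CPU info object. Only the flag
// names bSSSE3 and bSSE4 come from the thread; the rest is assumed.
struct CpuFlags {
    bool detected = false;  // set once detection has run in this module
    bool bSSSE3 = false;
    bool bSSE4 = false;     // assumed here to mean SSE4.1

    // Decode the ECX value returned by CPUID leaf 1 (EAX = 1).
    void DetectFromLeaf1Ecx(std::uint32_t ecx) {
        bSSSE3 = ((ecx >> 9) & 1u) != 0;   // ECX bit 9: SSSE3
        bSSE4 = ((ecx >> 19) & 1u) != 0;   // ECX bit 19: SSE4.1
        detected = true;
    }
};

// Guarded accessor: trips in debug builds if this module's copy was never
// initialized, instead of silently reporting "not supported".
inline bool HasSSSE3(const CpuFlags& flags) {
    assert(flags.detected && "run detection in this module first");
    return flags.bSSSE3;
}
```

A copy that never runs detection keeps both flags false, which would look exactly like "the flags do not work" from inside a plugin, even on a CPU that supports both instruction sets.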
(02-24-2010, 01:01 AM)nodchip Wrote: [ -> ]
(02-23-2010, 01:40 AM)ector Wrote: [ -> ]Nice stuff. If bSSSE3 and SSE4 don't work, they should be fixed - could just set them using your code. If you do that, I'll give you commit access so you can submit it, just PM me your gmail/google code account name and I'll add you.

I sent a private message to you.
After that, I figured out why bSSSE3 and bSSE4 do not work: cpu_info.Detect() is called only in Dolphin.exe's main() and not in Video_**.dll's main.cpp, so the flags had not been set when I tried to use them. I will commit another patch that fixes this once I have commit access.

You've been added.

BTW, for the vertex loader optimizations, your method adds one extra if () to the fast path. This could be avoided by writing a specialized function, for example Pos_ReadIndex16_Float3_SSE4, and compiling that into the vertex loader sequence instead when SSE4 is available.
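A rough sketch of that idea, not the actual Dolphin code: the reader is picked once when the loader sequence is built, so the per-vertex loop contains no feature check. All names here (Vec3, PosReader, the simplified Pos_Read_Float3_* functions, SelectPosReader, RunLoader) are hypothetical, and the "SSE4" variant is plain C++ standing in for an intrinsics version so the sketch stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
using PosReader = Vec3 (*)(const float* src);

// Generic fallback reader.
Vec3 Pos_Read_Float3_Generic(const float* src) {
    return {src[0], src[1], src[2]};
}

// Stand-in for an SSE4.1 reader; a real one would use intrinsics.
Vec3 Pos_Read_Float3_SSE4(const float* src) {
    return {src[0], src[1], src[2]};
}

// The pattern being criticized: the feature check runs for every vertex.
Vec3 Pos_Read_Float3_Branching(const float* src, bool has_sse4) {
    if (has_sse4) return Pos_Read_Float3_SSE4(src);
    return Pos_Read_Float3_Generic(src);
}

// The suggested pattern: decide once, when the loader is set up.
PosReader SelectPosReader(bool has_sse4) {
    return has_sse4 ? &Pos_Read_Float3_SSE4 : &Pos_Read_Float3_Generic;
}

// Fast path: calls the preselected reader, with no per-vertex feature test.
std::vector<Vec3> RunLoader(PosReader reader, const float* data,
                            std::size_t vertex_count) {
    assert(reader != nullptr);
    std::vector<Vec3> out;
    out.reserve(vertex_count);
    for (std::size_t i = 0; i < vertex_count; ++i) {
        out.push_back(reader(data + 3 * i));
    }
    return out;
}
```

The trade-off is one extra function per instruction set and format combination, in exchange for keeping the check out of the inner loop.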
Off-topic: Wow, I'm impressed by your coding skills. Have you been involved in any other emulation projects or projects that required coding?

Keep up the good work. Even if it only nets a 1-3% improvement, it is still better than nothing. :)
I committed the code. I also committed a fix for the issue that cpu_info is not initialized in the plugins.
Next time, I will modify the code to remove the extra if ().
Thanks for your work, nodchip!

Any measurable performance gain without losing compatibility is excellent. You should search for similar possibilities in the code.
(02-21-2010, 03:31 AM)Xtreme2damax Wrote: [ -> ]Anything that improves performance is good as far as I'm concerned so long as accurate emulation isn't sacrificed in the process. From what I've observed Dolphin has a large bottleneck with video/graphics emulation, so it could do Dolphin a world of good if someone was willing to optimize the vertex loaders and implement vertex caching.

Xtreme2damax is dreaming of a fullspeed Hyrule Field gain ^^

Good idea... I'm sure something nice is going to come out of this if it gets the chance to ripen a bit :)
Indeed, I did observe something regarding the Hyrule Field slowdown unrelated to this patch:

http://forums.dolphin-emu.org/thread-7495.html
Is there any way to have an SSSE3-only plugin? I have a Q6600, which has no SSE4.1 support, only SSSE3. When I try to use those plugins, Dolphin crashes.

sickofit

I'm interested in an SSSE3-only patch, too.

Does this patch slow me down if my CPU does not support SSE4.1, or is there a speed gain for me anyway?

Does this patch work with newer revisions as well? I ask because it's named ***rev5089.patch...

thank you so much for your help!
Is it possible to get SSE4a support? (AMD's version of SSE4)