Dolphin, the GameCube and Wii emulator - Forums
What is more optimized?SSSE3 or SSE4.2? - Printable Version


What is more optimized?SSSE3 or SSE4.2? - Manaplayer - 01-28-2011

Hi!

I'm using Lectrode's newest optimized Dolphin builds and, as the thread title says, I'm asking which is more optimized. In the list of builds, SSE4.2 is placed above SSSE3.
I asked Google, but it's hard to find an answer. That SSE4.2 is more optimized than SSE4.1 or SSE3 is self-evident.

Please don't be too hard on my English, I'm German...


RE: What is more optimized?SSSE3 or SSE4.2? - Kodiack - 01-28-2011

SSE is a series of instruction sets that gets built upon every so often. Processors capable of SSE 4.2 can do everything in the SSE 4.2 instruction set and earlier.

To answer your question, if it was written well, SSE 4.2-optimized code would have the SSE 3 optimizations available as well. As you're running a Core i7 CPU, the SSE 4.2 build is what I would recommend using. Note that you likely won't see a notable performance increase over SSE 3, though.


RE: What is more optimized?SSSE3 or SSE4.2? - kernel64 - 01-28-2011

So you really need to get the build which is optimized for your particular CPU. In your case it should be an SSE4.2 build for your Core i7 2600K.

Since Sandy Bridge and newer AMD CPUs now support AVX instructions, maybe we'll soon see some builds optimized for that extension.
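
The advice in the replies above (pick the newest build your CPU actually supports) can be sketched roughly as follows. This is only an illustration: the build names and the pick_build helper are hypothetical, and the flag spellings assume the way Linux lists CPU features in /proc/cpuinfo.

```python
# Hypothetical build names, newest first, paired with the CPU flag each needs.
# Flag spellings follow Linux /proc/cpuinfo (an assumption); note that Linux
# reports SSE3 under the name "pni".
BUILDS_NEWEST_FIRST = [
    ("SSE4.2", "sse4_2"),
    ("SSE4.1", "sse4_1"),
    ("SSSE3", "ssse3"),
    ("SSE3", "pni"),
    ("SSE2", "sse2"),
]


def pick_build(cpu_flags):
    """Return the newest build whose required flag the CPU reports, else None."""
    for build_name, required_flag in BUILDS_NEWEST_FIRST:
        if required_flag in cpu_flags:
            return build_name
    return None


# Example flag sets (illustrative, not read from real hardware).
core_i7_flags = {"sse", "sse2", "pni", "ssse3", "sse4_1", "sse4_2", "avx"}
core2_flags = {"sse", "sse2", "pni", "ssse3"}

print(pick_build(core_i7_flags))
print(pick_build(core2_flags))
```

On a real Linux machine the flag set could be filled from the "flags" line of /proc/cpuinfo; other operating systems expose the same information differently.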


RE: What is more optimized?SSSE3 or SSE4.2? - Manaplayer - 01-28-2011

Thanks for your answers, but I was asking about SS-S-E3. The other ones only have two "S"s, but what about the one with three?


RE: What is more optimized?SSSE3 or SSE4.2? - NaturalViolence - 01-28-2011

Ask yourself this. Which number is higher? Is 4.2 > 3?

SSE = streaming simd (single instruction multiple data, a.k.a. a vector) extension
SSSE = STACKLESS streaming simd extension (includes instructions that use stackless registers)

As was stated earlier, SSE4.2 is an EXTENSION. Therefore a build compiled for SSE4.2 is compiled with instructions from earlier SSE versions (including SSSE3) as well as the new instructions included in the extension.
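
A small sketch of that "extension" idea, under the assumption that the build is made with GCC or Clang, where enabling one SSE level (for example -msse4.2) also enables every earlier level. The enabled_levels helper is made up for illustration and is not part of any compiler.

```python
# Order in which the extensions were introduced on Intel CPUs.
SSE_LEVELS = ["sse", "sse2", "sse3", "ssse3", "sse4.1", "sse4.2"]


def enabled_levels(target):
    """List every level switched on when compiling for `target` (GCC/Clang-style)."""
    position = SSE_LEVELS.index(target)  # raises ValueError for unknown names
    return SSE_LEVELS[: position + 1]


print(enabled_levels("sse4.2"))  # includes "ssse3"
print(enabled_levels("ssse3"))   # stops before "sse4.1"
```

So a compiler targeting SSE4.2 is allowed to emit SSSE3 instructions too; whether it picks the best one in every case is a separate question.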


RE: What is more optimized?SSSE3 or SSE4.2? - Manaplayer - 01-28-2011

OK, that is the final answer. I couldn't find the difference between SSx and SSSx. That 4 is better than 3 was already clear.
Thanks for the answers.


RE: What is more optimized?SSSE3 or SSE4.2? - Squall Leonhart - 01-29-2011

(01-28-2011, 08:48 AM)NaturalViolence Wrote: SSSE = STACKLESS streaming simd extension (includes instructions that use stackless registers)

SSSE3 = Supplemental Streaming SIMD Extensions 3

Quote:That 4 is better than 3 was already clear

Not entirely the case.


RE: What is more optimized?SSSE3 or SSE4.2? - boogerlad - 01-29-2011

each sse version has different instructions. One may not be better than another. However, a higher sse version will have all of the older instructions too.


RE: What is more optimized?SSSE3 or SSE4.2? - NaturalViolence - 01-29-2011

@Squall

I stand corrected.

Anyways how could software compiled with the ssse3 flag possibly be any faster than if it's compiled with the sse4.2 flag? Wouldn't it still use ssse3 instructions when appropriate?


RE: What is more optimized?SSSE3 or SSE4.2? - Squall Leonhart - 01-29-2011

(01-29-2011, 05:43 AM)boogerlad Wrote: each sse version has different instructions. One may not be better than another. However, a higher sse version will have all of the older instructions too.

NOT TRUE AT ALL.

Quote:Anyways how could software compiled with the ssse3 flag possibly be any faster than if it's compiled with the sse4.2 flag? Wouldn't it still use ssse3 instructions when appropriate?

In an ideal case it should; unfortunately, that's not always the case. There are some instructions in SSE4 that do the same thing but take longer than doing it with SSE3.

Further, there's the fact that AMD Bulldozer chips will support AVX but not SSSE3/SSE4 (not that AVX performs well when mixed with other instruction sets anyway).
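
The point that the newest instruction is not automatically the fastest can be sketched as a dispatcher that picks the cheapest usable code path instead of the highest SSE level. Everything here is hypothetical: the path names, the choose_impl helper, and especially the cost numbers, which are invented for illustration and are not measurements.

```python
def choose_impl(candidates, cpu_features):
    """Pick the cheapest candidate the CPU can run.

    candidates: list of (name, required_feature, estimated_cost) tuples.
    Returns the chosen name, or None if no candidate is usable.
    """
    usable = [c for c in candidates if c[1] in cpu_features]
    if not usable:
        return None
    return min(usable, key=lambda c: c[2])[0]


# Invented costs: here the older SSSE3 path happens to be cheaper.
candidates = [
    ("ssse3_path", "ssse3", 10),
    ("sse42_path", "sse4.2", 14),
]

print(choose_impl(candidates, {"ssse3", "sse4.2"}))
print(choose_impl(candidates, {"sse4.2"}))
```

A compiler given only a single target flag does not necessarily make this kind of per-routine choice, which is one way an SSE4.2 build could end up no faster than an SSSE3 one.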