Dolphin, the GameCube and Wii emulator - Forums

Full Version: Benchmark results of Build Time optimisation flags + LTO

sheepdestroyer

Hi, first post here for me

I did some small-scale benchmarking of different compile flags, using the git version from October 2nd.
My system is a Sandy Bridge Core i7 2620 (dual core + HT, so 4 threads); I run Fedora 20 64-bit with gcc 4.9.1 from rawhide.

The standard build uses -O3, targets the generic x86-64 architecture (I think?) and does not use LTO.
I was interested in finding out the effect of -Ofast instead of -O3, of -march=native (equivalent to -march=corei7-avx on my CPU), and of LTO with a recent gcc.

I used the povray.elf benchmark found on this forum.

Build                          Run 1       Run 2       Run 3
Standard                       15min 43s   15min 43s   15min 44s
-Ofast                         15min 43s   15min 42s
march=native                   15min 42s   15min 44s
march=native & -Ofast          15min 44s   15min 44s
LTO                            15min 42s   15min 42s
LTO & -Ofast                   15min 43s   15min 44s
LTO & march=native             15min 44s   15min 42s
LTO & march=native & -Ofast    15min 42s   15min 42s

As you can see, the results are inconclusive, and I am tempted to think that neither the optimization flags nor LTO matter.
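
To put numbers on that, the timings can be converted to seconds and compared against the standard build. This is only a minimal sketch in Python: the values are copied from the table above, and the names (`runs`, `to_seconds`, `means`, `spread`) are made up for illustration.

```python
# Timings copied from the table above, as (minutes, seconds) per run.
runs = {
    "Standard": [(15, 43), (15, 43), (15, 44)],
    "-Ofast": [(15, 43), (15, 42)],
    "march=native": [(15, 42), (15, 44)],
    "march=native & -Ofast": [(15, 44), (15, 44)],
    "LTO": [(15, 42), (15, 42)],
    "LTO & -Ofast": [(15, 43), (15, 44)],
    "LTO & march=native": [(15, 44), (15, 42)],
    "LTO & march=native & -Ofast": [(15, 42), (15, 42)],
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) timing to total seconds."""
    return minutes * 60 + seconds


# Mean run time per build configuration, in seconds.
means = {
    name: sum(to_seconds(*t) for t in times) / len(times)
    for name, times in runs.items()
}

# Relative difference of each configuration against the standard build.
baseline = means["Standard"]
for name, mean_s in means.items():
    delta_pct = 100.0 * (mean_s - baseline) / baseline
    print(f"{name:30s} {mean_s:7.1f} s  {delta_pct:+.2f}% vs Standard")

# Gap between the fastest and slowest single run over all configurations.
all_times = [to_seconds(*t) for times in runs.values() for t in times]
spread = max(all_times) - min(all_times)
print(f"spread over all runs: {spread} s")
```

The fastest and slowest runs are 2 seconds apart on a run of roughly 943 seconds, about 0.2%, and the three standard runs already differ by 1 second among themselves.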
But I only tried one synthetic benchmark and would like your advice on how to benchmark real games in a reproducible way.
Thanks for your comments
Interesting. Yeah, no real change, though.
Yeah, optimizations like this in the past were shown to have no noticeable speed increases in Dolphin. Looks like that hasn't changed.

If the OP is interested in benchmarking commercial games, Dolphin can create "movies" that you can use to replay games in an exact manner, iirc. Dolphin can also record the FPS to a file, so you could recreate each run and compare the results instead of eyeballing it.

sheepdestroyer

Yes, compile flags do not usually bring much (it is still fun to bench them from time to time), but LTO is quite a new optimisation for Dolphin that was recently added to CMakeLists.txt.
Furthermore, GCC 4.9 had a focus on enhancing LTO support, so I was hoping for some results on this front.
I heard that new LTO optimisations are intended to be enabled in GCC 5, so I may try again when it becomes testable.

Regarding replaying a portion of a game, I was wondering whether TAS support could record a list of inputs and replay them, so as to get an exact replay of the same sequence. I did not investigate that further; there may already be an answer on this forum...
LTO isn't going to make any difference when the majority of code being executed is dynamically generated.
(10-12-2014, 01:37 AM)sheepdestroyer Wrote: [ -> ]Regarding replaying a portion of a game, I was wondering whether TAS support could record a list of inputs and replay them, so as to get an exact replay of the same sequence. I did not investigate that further; there may already be an answer on this forum...

TAS would work, but it's really much easier to just go to Emulation -> Start Recording and Play Recording. It takes care of all input if I remember correctly, so you just play and use the .dtm file that Dolphin generates, then compare FPS between different GCC options.
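
For the comparison step, something like the following could work. It is only a rough sketch in Python and rests on an assumption: that each run's FPS log has been reduced to one FPS number per line (the file Dolphin actually writes may look different). The helper names `parse_fps`, `summarize` and `compare` are made up for this example.

```python
from statistics import mean, stdev


def parse_fps(text):
    """Parse one FPS value per line, skipping blank or non-numeric lines.

    Assumption: the log has one number per line; adapt this to the real format.
    """
    values = []
    for line in text.splitlines():
        try:
            values.append(float(line.strip()))
        except ValueError:
            continue  # not a number (blank line, header, ...)
    return values


def summarize(values):
    """Mean, minimum and sample standard deviation of an FPS series."""
    return {
        "mean": mean(values),
        "min": min(values),
        "stdev": stdev(values) if len(values) > 1 else 0.0,
    }


def compare(baseline, candidate):
    """Relative change in mean FPS of candidate vs baseline, in percent."""
    return 100.0 * (mean(candidate) - mean(baseline)) / mean(baseline)
```

You would read the log of each build with `open(path).read()`, pass the text to `parse_fps`, and then call `compare(standard_values, lto_values)`. A difference that is small next to the `stdev` of the individual runs is probably just noise.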

Like tueidj, I'd expect a lot of Dolphin's grunt work to come from dynamically recompiling code. Optimizations to the code Dolphin generates would result in noticeable speedups, like the kind of commits from Fiora we've been getting recently. For fun, perhaps you could see if anything changes when using the interpreter (for both the CPU and DSP).