Dolphin, the GameCube and Wii emulator - Forums

Android Optimisation research

thelittlefireman

Hi everyone :)

I know I'm new to this forum. I'm an Android developer and I would like to do some research into speeding up the Android version of Dolphin.

First of all, thanks for your help ;)

I wonder:

- What is the main bottleneck in the Android version: the GPU or the CPU?

- After looking into the Dolphin source code, I didn't see any NEON optimizations for Android. Has there been any testing or research on this in the past?

- Would it be useful if I tried to convert some Wii/GameCube CPU instructions to NEON-optimized instructions, or is that totally useless?

Useful links:

Automatic vectorization when compiling C/C++ source for Android:
http://infocenter.arm.com/help/index.jsp...04s03.html

A useful library for calling NEON-optimized functions:
https://github.com/ARM-software/ComputeLibrary

NEON example:
https://github.com/googlesamples/android...hello-neon

Thanks for your help :)

Sorry for my bad English :/
May I ask you to join us on IRC, in #dolphin-emu on freenode? Most development discussion happens there.

The bottleneck depends on the game and the device:
We are often limited by the CPU overhead of the GPU driver, especially for buffer management on Mali.
Some games require reading back the framebuffer; there we are limited by GPU latency. Drivers often interpret this as low usage and switch to a low-power mode, which makes it even worse.
We are almost never limited by the GPU's shading performance, at least if you don't upscale the graphics ...
*But* nobody on the current development team is willing to spend their time on workarounds for bad GPU drivers. Dolphin runs quite well on the Nvidia Android TV, where it is fully bottlenecked by the CPU emulation.

Emulation is by definition bottlenecked by the CPU emulation (except on terrible drivers). We have a PPC -> AArch64 recompiler: https://github.com/dolphin-emu/dolphin/t...C/JitArm64
It is a bit dumb (a single-pass recompiler), but it is nearly feature complete. It does lack a few optimizations compared to the x64 one.
The PPC has 2*fp32 SIMD, which is implemented with NEON. But we don't have an auto-vectorizer in our recompiler.
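
To make the 2*fp32 point concrete, here is a rough sketch of how one paired-single operation (ps_madd, frD = frA * frC + frB on both lanes) lines up with a two-lane NEON operation. This is not Dolphin's code: the JIT emits machine code directly rather than calling intrinsics, and the names `PairedSingle` and `ps_madd_sketch` are made up for illustration. Rounding and denormal details that a real emulator has to care about are ignored here.

```cpp
#include <array>
#include <cmath>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical type: one paired-single register as two fp32 lanes (ps0, ps1).
using PairedSingle = std::array<float, 2>;

// Sketch of ps_madd: result = a * c + b on both lanes.
// Illustration only; precision handling of a real emulator is not modelled.
inline PairedSingle ps_madd_sketch(const PairedSingle& a, const PairedSingle& c,
                                   const PairedSingle& b)
{
#if defined(__aarch64__)
  const float32x2_t va = vld1_f32(a.data());
  const float32x2_t vc = vld1_f32(c.data());
  const float32x2_t vb = vld1_f32(b.data());
  // vfma_f32(acc, x, y) computes acc + x * y per lane as a fused multiply-add.
  const float32x2_t vr = vfma_f32(vb, va, vc);
  PairedSingle out{};
  vst1_f32(out.data(), vr);
  return out;
#else
  // Portable fallback so the sketch also builds on non-ARM hosts.
  return {std::fma(a[0], c[0], b[0]), std::fma(a[1], c[1], b[1])};
#endif
}
```

Calling `ps_madd_sketch({1.0f, 2.0f}, {3.0f, 4.0f}, {0.5f, 0.25f})` multiplies and accumulates both lanes in one operation on AArch64, which is the kind of mapping meant above.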

Another NEON use case is the vertex loader. I think its performance is good enough, but it should be profiled again to be sure: https://github.com/dolphin-emu/dolphin/b...rARM64.cpp

We lack a NEON texture decoder, but since we have a GLSL texture decoder, the need for one is much lower now: https://github.com/dolphin-emu/dolphin/b...er_x64.cpp
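
For anyone wondering what a texture decoder does per texel, here is a minimal scalar sketch for the RGB565 format. It is not taken from Dolphin; `RGBA8`, `expand5`, `expand6` and `decode_rgb565` are hypothetical names, and the texel is assumed to be already byte-swapped to host order, with the tiled block layout of the source texture left out. A NEON version would do the same shifts and masks on several texels at once.

```cpp
#include <cstdint>

// Hypothetical output texel: 8 bits per channel.
struct RGBA8
{
  std::uint8_t r, g, b, a;
};

// Expand a 5-bit channel to 8 bits by replicating the top bits into the low bits.
constexpr std::uint8_t expand5(std::uint32_t v)
{
  return static_cast<std::uint8_t>((v << 3) | (v >> 2));
}

// Expand a 6-bit channel to 8 bits the same way.
constexpr std::uint8_t expand6(std::uint32_t v)
{
  return static_cast<std::uint8_t>((v << 2) | (v >> 4));
}

// Decode one RGB565 texel (assumed to be in host byte order) to opaque RGBA8.
constexpr RGBA8 decode_rgb565(std::uint16_t texel)
{
  return RGBA8{expand5((texel >> 11) & 0x1F), expand6((texel >> 5) & 0x3F),
               expand5(texel & 0x1F), 0xFF};
}
```

Bit replication is used instead of a plain shift so that the maximum channel value maps to 255 rather than 248 or 252.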

Edit: May I ask you to read pages 6 to 14 of https://www.alchemistowl.org/pocorgtfo/pocorgtfo06.pdf (warning: 100MB)? It is a bit dated and about the x64 JIT, but its statements are still valid.