Dolphin, the GameCube and Wii emulator - Forums

Full Version: How does Dolphin work?
So while I was doing my daily browsing through the Github page, I thought of something - how does this thing actually work? How does Dolphin know how to play games? It's a fairly vague question, I know, but one I'd love to hear the answer to. Even a brief description of how CPU or DSP emulation works would really make my day. Thanks in advance!
There's actually a fairly similar discussion thread here that you may be interested in reading, if you haven't seen it already!

Just note that it did veer off course...
This explanation doesn't actually do justice to everything this emulator actually does but I will explain it as simply as I can.

What Dolphin does is look at the game's code and turn it into code a PC can read.

For example, if code x tells the Wii to draw something on the screen, Dolphin translates it into the equivalent code a PC uses to accomplish the same thing.
What, no dev or regular responses yet? Guess I'll throw my 2 cents into this then. Not that I'm an expert with the GC or Wii, but I'll describe some of the high-level stuff that's pretty much generic for most any emulator, and elaborate on what bomblord said.

@AdderDee - Dolphin is a large program (almost any emulator will be) and the GC/Wii aren't exactly "simple" machines. You should probably break down what you want to know into smaller components, rather than asking for a broad overview. I mean, so much goes on in Dolphin, that compressing it into a small paragraph fails to really give you an idea of how Dolphin works, even on a superficial level. If you asked stuff like "How is the CPU emulated?", "How are textures emulated?", or "How are memory cards emulated?", you're much more likely to get a concise, detailed answer from a developer or knowledgeable user. Anyway, here goes nothing...

In order to properly emulate the GC/Wii, Dolphin needs to reliably emulate the individual components that make up the consoles. As bomblord hinted, one of those components is the PowerPC-based CPU found in the GC/Wii. There are at least two other processors found in both systems: the GPU and the DSP. I'll assume you know what each does, if it's not obvious enough even to the average Dolphin user (in case not, they handle video and audio processing respectively). But wait, there's more! Dolphin also has to emulate other aspects of the two systems, such as how they handle memory (reading or writing to different locations, which locations can be read/written to and when, how memory is transferred, the byte order of data), how they handle controller input, how they handle file I/O (for the discs, memory cards, etc.), how they handle interrupts, and the list goes on. Anything that the GC/Wii does, Dolphin needs code to get the host system to recreate.
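
To give one small taste of the "byte order of data" point: the GC/Wii CPU is big-endian, while a typical x86 PC is little-endian, so an emulator can't just reinterpret emulated memory as host integers. Here is a minimal C++ sketch of reading and writing a 32-bit value in big-endian order. The function names are made up for illustration and are not Dolphin's actual helpers.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit big-endian value from emulated memory.
// Works the same on any host, whatever the host's own byte order is.
uint32_t ReadBigEndian32(const uint8_t* p) {
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) |
           static_cast<uint32_t>(p[3]);
}

// Writes a 32-bit value into emulated memory, most significant byte first.
void WriteBigEndian32(uint8_t* p, uint32_t value) {
    p[0] = static_cast<uint8_t>(value >> 24);
    p[1] = static_cast<uint8_t>(value >> 16);
    p[2] = static_cast<uint8_t>(value >> 8);
    p[3] = static_cast<uint8_t>(value);
}

// Small usage example: a write followed by a read returns the same value.
void RoundTripExample() {
    uint8_t memory[4] = {};
    WriteBigEndian32(memory, 0x12345678u);
    assert(memory[0] == 0x12);  // most significant byte is stored first
    assert(ReadBigEndian32(memory) == 0x12345678u);
}
```

A real emulator would likely use a byte-swap instruction on the host instead of shifting byte by byte, but the idea is the same.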

The CPU is probably the easiest point for non-developers to understand. The CPU deals with instructions that tell the processor to do things like "write this value to this memory location", or "add this value to that value, save the results here". Instructions are the building blocks for programs, and in the case of the GC/Wii, this means games. The CPU is fed a constant (or near constant, I guess, depending on how you view pipelining) series of bytes; these bytes form the actual instruction itself. While to the ordinary user, a series of bytes that looks like 0x3800FFFF means absolutely nothing, to the GC/Wii's PPC CPU, this tells it to do a specific action (in this case, it tells the CPU to load a value of -1 into one of its registers). It's up to Dolphin to read these bytes, determine what kind of instruction they represent, and faithfully recreate the results of its execution. The "game code" Dolphin translates is raw binary data that represents PowerPC assembly. So to begin, Dolphin emulates the CPU, whereby it continually fetches new instructions, executes them, fetches even more, executes even more, and continues on in this fashion until something causes this behavior to stop (an error or exception, for example).
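
To make the 0x3800FFFF example concrete, here is a rough C++ sketch of how those bytes can be picked apart. The top 6 bits are the opcode (14, which is addi), the next two 5-bit fields are registers, and the low 16 bits are a signed immediate. When the source register field is 0, addi behaves as "load immediate" (li). The struct and function names here are invented for this post and are not taken from Dolphin's source.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Fields of a PowerPC D-form instruction such as addi.
struct DForm {
    uint32_t opcode;  // top 6 bits of the instruction word
    uint32_t rd;      // destination register number
    uint32_t ra;      // source register number (0 means "literal zero" for addi)
    int32_t simm;     // 16-bit immediate, sign-extended
};

DForm DecodeDForm(uint32_t word) {
    DForm d;
    d.opcode = word >> 26;
    d.rd = (word >> 21) & 0x1F;
    d.ra = (word >> 16) & 0x1F;
    d.simm = static_cast<int16_t>(word & 0xFFFF);  // sign-extend the low 16 bits
    return d;
}

// Tiny disassembler that only understands addi (opcode 14).
std::string Disassemble(uint32_t word) {
    const DForm d = DecodeDForm(word);
    if (d.opcode != 14)
        return "unknown";
    if (d.ra == 0)
        return "li r" + std::to_string(d.rd) + ", " + std::to_string(d.simm);
    return "addi r" + std::to_string(d.rd) + ", r" + std::to_string(d.ra) + ", " +
           std::to_string(d.simm);
}

// Usage example: the instruction from the post loads -1 into r0.
void DecodeExample() {
    assert(Disassemble(0x3800FFFFu) == "li r0, -1");
}
```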

Moving on to the approaches Dolphin takes for CPU emulation, there are two types we generally concern ourselves with: interpretation and recompilation. With interpretation, we emulate the instructions the CPU executes using equivalent code written in some higher-level language (C++ in our case). With recompilation, we emulate these instructions using the assembly language native to the host PC. That is to say, Dolphin generates x86 or ARM assembly and then executes the code. The type of recompilation Dolphin does is dynamic because all of this translation happens at runtime. This "dynarec" is also simply referred to as the JIT.
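
For the interpretation side, a toy fetch/decode/execute loop might look roughly like the following. This is only a sketch under big assumptions: it handles just two PowerPC instructions (addi and add, ignoring the overflow and record bits), the program is already a vector of 32-bit words, and ToyCpu and RunInterpreter are names I made up, not Dolphin's.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyCpu {
    uint32_t gpr[32] = {};  // general purpose registers
    uint32_t pc = 0;        // program counter, in bytes
};

// Runs until the program ends or an unimplemented instruction is hit.
// Returns the number of instructions executed.
std::size_t RunInterpreter(ToyCpu& cpu, const std::vector<uint32_t>& program) {
    std::size_t executed = 0;
    while (cpu.pc / 4 < program.size()) {
        const uint32_t inst = program[cpu.pc / 4];  // fetch
        const uint32_t opcode = inst >> 26;         // decode
        const uint32_t rd = (inst >> 21) & 0x1F;
        const uint32_t ra = (inst >> 16) & 0x1F;
        const uint32_t rb = (inst >> 11) & 0x1F;

        if (opcode == 14) {  // addi rd, ra, simm (ra == 0 means literal zero)
            const int32_t simm = static_cast<int16_t>(inst & 0xFFFF);
            const uint32_t base = (ra == 0) ? 0u : cpu.gpr[ra];
            cpu.gpr[rd] = base + static_cast<uint32_t>(simm);
        } else if (opcode == 31 && ((inst >> 1) & 0x1FF) == 266) {  // add rd, ra, rb
            cpu.gpr[rd] = cpu.gpr[ra] + cpu.gpr[rb];
        } else {
            break;  // unimplemented: a real emulator would raise an error here
        }
        cpu.pc += 4;  // every PowerPC instruction is 4 bytes long
        ++executed;
    }
    return executed;
}

// Usage example: li r3, 5 ; li r4, -1 ; add r5, r3, r4
void InterpreterExample() {
    ToyCpu cpu;
    const std::vector<uint32_t> program = {0x38600005u, 0x3880FFFFu, 0x7CA32214u};
    assert(RunInterpreter(cpu, program) == 3);
    assert(cpu.gpr[5] == 4);
}
```

A recompiler would decode the same words, but instead of doing the work right away it would emit host machine code that does it, then jump into that code.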

The interpreter is slow because not all of the C++ is or can be optimized, and emulating a PPC this way simply takes a lot of hardware resources. However, since it is written in a high-level language (C++), it is very portable to other platforms that don't yet have their own JIT recompilers (basically anything that isn't x86 or ARM based, or that for whatever reason prohibits JIT recompilers). The interpreter is also used for debugging purposes, and even Dolphin's JIT recompiler defers to the interpreter for instructions it can't handle. The JIT recompiler, on the other hand, is quite fast since it can optimize code on-the-fly, and whatever assembly it generates is likely to be orders of magnitude faster than the C++ code in the interpreter. On the downside, the JIT recompiler needs to be changed to account for different assembly languages. It wasn't until relatively recently that the JIT recompiler could produce ARM code.

The GPU and its emulation is... not my domain. :p It's a mysterious bag of mysteries to me, but we'll see if even my sketchy knowledge helps. At any rate, eventually the CPU will want to do more than process silly maths. Adding and multiplying numbers only gets one so far unless you can display something. The CPU can issue commands to the GPU telling it to draw certain data. Dolphin can either emulate the GC/Wii GPU in software or via hardware accelerated backends. Using software rendering, Dolphin manually calculates all of the vertices, matrices, color blending, etc. using... well, software, or more accurately, CPU-side code. As you might imagine, this is a slow process, and your FPS will reflect that. The result, however, is usually per-pixel accuracy (thanks to having more or less complete control over each calculation) and it creates a solid base for developers to debug any issues with the hardware backends. These hardware backends use APIs (Direct3D11 and OpenGL currently) that access your GPU to do all of these calculations rather than your CPU. Naturally, since this is the job your GPU was born to do, it's quite fast compared to your CPU. The problem developers face, however, is that there are many aspects where your GPU does not (and perhaps cannot) act like the GC/Wii's "Flipper" hardware. For instance, D3D11 or OGL may simply not permit operations that the GC/Wii can natively do with ease; therefore, workarounds and approximations come into play. There have been a number of articles on Dolphin's main site (especially as it relates to D3D9's removal) so take a look at those for more info.

The DSP works on the idea of microcodes, often called "ucodes" for short (the u is supposed to be µ, but nobody can be bothered to type it up in informal discussions...) These ucodes define a set of rules that determine how the CPU communicates audio data to the DSP and how the DSP processes and outputs that sound. They're like miniature "standards" used by the GC and Wii since the ruleset is known and generally adhered to by most programs running on these consoles. As far as commercial games are concerned, there are a few ucodes: the infamous "Zelda" ucode used in Nintendo EAD games, the AX and AXWii ucodes used in just about every other game, the GBA ucode, the CARD ucode, and two others (ROM and INIT) whose purpose I'm not entirely sure of. As an aside, both Skyward Sword and NSMB Wii, despite being co-developed by Nintendo EAD, use the AXWii ucode. The GC LoZ: Collector's Edition uses the AX ucode to my knowledge, and this game too was co-developed by Nintendo EAD, so any co-developed games may be excluded from the Zelda ucode list. I haven't researched this, so someone please verify and correct the above statements.

Generally, the CPU and DSP communicate via "mailbox" registers that are used to pass along commands (in the case of the AX and AXWii, the registers pass along the address to said commands) and via Direct Memory Access for audio data. Delroth has already explained how Dolphin emulates the AX ucode in this blog post though I'm aware that the AXWii ucode has a few differences. For information about the Zelda ucode, see here. Note that our current HLE implementation of the Zelda ucode is incomplete. It's near perfect, but there's still a bit left to reverse engineer. We're almost there, or at least more there than not there at all.

At any rate, Dolphin has two methods of emulating the DSP, what we call High Level Emulation and Low Level Emulation, or HLE and LLE respectively. With LLE, Dolphin will try to emulate the DSP much the same as it does the CPU, that is, Dolphin will feed the emulated DSP instructions, execute those instructions, and repeat the process over and over. The DSP is really its own fully programmable co-processor. It has its own internal registers and its own assembly language. The GC/Wii DSP has its own ROM (called Instruction ROM) and a coefficient table (used for mixing? I am unsure precisely what it does); in order to emulate the DSP via LLE, Dolphin needed these bits of data because it would have been impossible to step through each of the DSP instructions without them. Previously, users had to dump these files manually and point Dolphin to them. However, someone initially reverse engineered files that would work for the Zelda ucode, then Delroth further reverse engineered them to work with the AX and AXWii ucodes, and these files are included with Dolphin. Since, under LLE, Dolphin is emulating the DSP in much the same fashion as the CPU, the DSP has options for an interpreter or a JIT recompiler.

HLE, on the other hand, takes a different approach to DSP emulation. Rather than stepping instruction-by-instruction, HLE relies on the well-known predictability of the ucodes. Remember, the ucodes are set, defined ways the game handles audio. Say Dolphin's emulated CPU writes to one of the emulated DSP's mailbox registers. Depending on the contents, Dolphin can say something along the lines of "Hey, the CPU wants me to do XYZ. No need to run lots of instructions, I'll just grab the audio data from this area, mix it at the correct time, and presto!" It's all about abstracting the process. In reality, whenever a real DSP receives something in its mailbox register, it processes dozens (if not hundreds?) of instructions to determine what needs to be done and how to do it. Since we can predict what will happen when, Dolphin can essentially skip the process of running those individual instructions and go about getting the results.
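
As a very rough illustration of the HLE idea, the sketch below reacts to a mailbox write by calling ordinary C++ directly instead of running DSP instructions. Everything specific in it is invented for the example: the command numbers, the "upper 16 bits are the command, lower 16 bits are the argument" mail layout, and the two voice buffers standing in for audio data in emulated RAM. The real AX and Zelda ucodes use their own command formats.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical command numbers, made up for this sketch.
constexpr uint32_t kCmdSetVolume = 1;
constexpr uint32_t kCmdMix = 2;

struct HleDsp {
    int32_t volume_percent = 100;
    std::vector<int16_t> voice_a;  // stands in for audio data in emulated RAM
    std::vector<int16_t> voice_b;
    std::vector<int16_t> output;

    // Called when the emulated CPU writes to the DSP mailbox.
    // Returns false for a command this HLE does not know about.
    bool HandleMail(uint32_t mail) {
        const uint32_t command = mail >> 16;
        const uint32_t argument = mail & 0xFFFF;
        switch (command) {
        case kCmdSetVolume:
            volume_percent = static_cast<int32_t>(std::min<uint32_t>(100, argument));
            return true;
        case kCmdMix:
            Mix();
            return true;
        default:
            return false;
        }
    }

    // Adds the two voices, applies the volume, and clamps to the 16-bit range.
    void Mix() {
        const std::size_t count = std::min(voice_a.size(), voice_b.size());
        output.assign(count, 0);
        for (std::size_t i = 0; i < count; ++i) {
            const int32_t sum = static_cast<int32_t>(voice_a[i]) + voice_b[i];
            const int32_t scaled = sum * volume_percent / 100;
            output[i] = static_cast<int16_t>(
                std::max<int32_t>(-32768, std::min<int32_t>(32767, scaled)));
        }
    }
};

// Usage example: one mailbox write triggers the whole mix in host code.
void HleExample() {
    HleDsp dsp;
    dsp.voice_a = {1000};
    dsp.voice_b = {2000};
    assert(dsp.HandleMail(kCmdMix << 16));
    assert(dsp.output[0] == 3000);
}
```

Under LLE, the same mailbox write would instead wake up an emulated DSP that runs its own instruction stream to reach the same mixed output.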

Therein lies the speed difference between the two approaches. LLE takes many more steps than HLE, so it demands more from your system's hardware resources. HLE has to do far less to get the same results, thus it's less stressful on the CPU. However, since Dolphin's DSP under LLE goes through all of the steps a real GC/Wii DSP would take, it is technically more accurate than HLE. In the past, the difference was noticeable, to the point that quite a few games "required" LLE audio for any sort of decent audio quality. However, the HLE implementation of the DSP at the time was simply inaccurate. The situation has since reversed, and now LLE audio is hardly recommended except in a limited number of circumstances. HLE audio is not entirely perfect in Dolphin (some sounds "pop" in SMG1 and 2 for example) but we've never been in a better position in all of these years.

And that's about all I want to write for tonight. I didn't even scratch a tenth of the stuff Dolphin really does; there are all sorts of lower-level things going on. Tickles the mind, doesn't it? ;) Anyway, big disclaimer in that I'm not a Dolphin dev (yet) and my experience with programming a GC emulator barely involves futzing around with the DSP and handling input. Though I am an emulator developer, that doesn't certify me to speak about Dolphin. Just thought I'd share what I know (or what I think I know). Additions and corrections are warranted and wanted.

On a side note, I was just thinking about this the other day, but perhaps some Dolphin Articles might be dedicated to explaining how certain aspects of Dolphin work (like CPU emulation). It'd be a bit of work to compile, but people seem to eat it up every time things like that get posted.
I'll try to summarize it!

Dolphin has a CPU core that parses the input PowerPC instructions, recompiles them into blocks of x86 code, and runs that code, jumping from compiled block to compiled block to match the execution of instructions on the original console. As new code comes in or old code is invalidated, it updates that recompiled code as necessary. It maintains a layer that emulates the memory layout of the original Gamecube/Wii games, to make sure that all memory accesses work as they would on the console. (This paragraph is the part I've mainly worked on, also known as the JIT)
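
Here is a toy C++ sketch of the "compile a block once, then jump from block to block" idea. Big caveat: it does not emit x86 at all. Each "compiled" operation is a std::function standing in for generated host code, only addi is supported, and BlockCache and its methods are names invented for this post. What it does show is the shape of the mechanism: look up a block by PC, compile on a miss, run it, and throw compiled blocks away when the underlying code changes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

struct ToyCpu {
    uint32_t gpr[32] = {};
    uint32_t pc = 0;
};

using HostOp = std::function<void(ToyCpu&)>;  // stand-in for emitted host code

struct Block {
    std::vector<HostOp> ops;
    uint32_t end_pc = 0;  // PC to continue from after the block has run
};

class BlockCache {
public:
    explicit BlockCache(const std::vector<uint32_t>& code) : code_(code) {}

    // Runs the block starting at cpu.pc, compiling it first on a cache miss.
    // Returns false if nothing could be compiled at that address.
    bool RunBlockAt(ToyCpu& cpu) {
        auto it = blocks_.find(cpu.pc);
        if (it == blocks_.end()) {
            Block block = Compile(cpu.pc);
            if (block.ops.empty())
                return false;
            ++compile_count_;
            it = blocks_.emplace(cpu.pc, std::move(block)).first;
        }
        for (const HostOp& op : it->second.ops)
            op(cpu);
        cpu.pc = it->second.end_pc;
        return true;
    }

    // Must be called when the emulated code changes (e.g. the game overwrote it).
    void Invalidate() { blocks_.clear(); }

    int compile_count() const { return compile_count_; }

private:
    // Translates consecutive addi instructions into host operations.
    Block Compile(uint32_t pc) const {
        Block block;
        while (pc / 4 < code_.size() && (code_[pc / 4] >> 26) == 14) {
            const uint32_t inst = code_[pc / 4];
            const uint32_t rd = (inst >> 21) & 0x1F;
            const uint32_t ra = (inst >> 16) & 0x1F;
            const uint32_t imm = static_cast<uint32_t>(
                static_cast<int32_t>(static_cast<int16_t>(inst & 0xFFFF)));
            block.ops.push_back([rd, ra, imm](ToyCpu& c) {
                c.gpr[rd] = (ra == 0 ? 0u : c.gpr[ra]) + imm;
            });
            pc += 4;
        }
        block.end_pc = pc;
        return block;
    }

    const std::vector<uint32_t>& code_;
    std::unordered_map<uint32_t, Block> blocks_;
    int compile_count_ = 0;
};

// Usage example: the second run of the same address reuses the cached block.
void BlockCacheExample() {
    const std::vector<uint32_t> code = {0x38600005u, 0x38630001u};  // li r3,5 ; addi r3,r3,1
    BlockCache cache(code);
    ToyCpu cpu;
    assert(cache.RunBlockAt(cpu) && cpu.gpr[3] == 6);
    cpu.pc = 0;
    assert(cache.RunBlockAt(cpu) && cache.compile_count() == 1);
}
```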

Dolphin emulates the hardware that connects to the CPU, such as the memory card, audio DSP, controllers, and the FIFO that passes commands from the CPU to the GPU. Finally, it has a GPU engine that takes these commands and parses them into meaningful OpenGL/DirectX commands (vertexes, shaders, textures, etc) and manages CPU/GPU interactions, like reading the framebuffer or updating textures. All of this then plugs into a user interface to let you play the actual game.

Edit: agh, beaten to the punch ;)
Thanks a lot for the explanation. I have two additional questions: What's the MMU and what is JITIL?
The "MMU" is a piece of hardware that maps virtual memory locations to physical memory locations in a CPU, plus it handles things like page faults (accessing a virtual memory area that isn't mapped to physical memory) and other issues of memory access permissions. Basically every computer has one, so to emulate the Gamecube/Wii, Dolphin has to emulate the MMU.
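
A stripped-down C++ sketch of what "mapping virtual to physical" means is shown here. It is a plain lookup table with 4 KiB pages and a single write-permission flag, which is far simpler than the real PowerPC MMU and is not how Dolphin implements it. ToyMmu and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

constexpr uint32_t kPageShift = 12;  // 4 KiB pages
constexpr uint32_t kPageMask = (1u << kPageShift) - 1;

struct PageEntry {
    uint32_t physical_page;
    bool writable;
};

class ToyMmu {
public:
    void Map(uint32_t virtual_page, uint32_t physical_page, bool writable) {
        pages_[virtual_page] = PageEntry{physical_page, writable};
    }

    // Translates a virtual address. Returns false on a fault: either the page
    // is not mapped, or a write was attempted on a read-only page.
    bool Translate(uint32_t vaddr, bool is_write, uint32_t* paddr) const {
        const auto it = pages_.find(vaddr >> kPageShift);
        if (it == pages_.end())
            return false;  // page fault: nothing mapped here
        if (is_write && !it->second.writable)
            return false;  // protection fault
        *paddr = (it->second.physical_page << kPageShift) | (vaddr & kPageMask);
        return true;
    }

private:
    std::unordered_map<uint32_t, PageEntry> pages_;
};

// Usage example: the offset inside the page survives translation unchanged.
void MmuExample() {
    ToyMmu mmu;
    mmu.Map(0x80000, 0x00000, true);
    uint32_t paddr = 0;
    assert(mmu.Translate(0x80000123u, false, &paddr) && paddr == 0x00000123u);
}
```

On a fault, a real CPU raises an exception and the game's own handler decides what to do, which is part of why games that lean on the MMU are hard to emulate quickly.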

Most games only require a very superficial and simple emulation; others require "full MMU" mode (which still isn't the full thing, just... well, better than nothing). Full MMU mode is much slower, so games that require it tend to be really slow.

Some games don't even work with full MMU; they likely make use of the MMU beyond what Dolphin supports at all, possibly as a form of anti-emulation protection (Disney Infinity, Cars 2, Toy Story 3).
(09-14-2014, 03:13 PM) Fiora wrote: Some games don't even work with full MMU; they likely make use of the MMU beyond what Dolphin supports at all, possibly as a form of anti-emulation protection (Disney Infinity, Cars 2, Toy Story 3).

I think that's mostly because Dolphin's MMU code is horribly broken in TranslatePageAddress:
- Page permissions are ignored
- Recording bits aren't tracked in the TLB entries
- If there's a hit in the TLB, processing ends; a PTE lookup may still be required to update C (theoretically all PTEs in the TLB will have R set already)
- Stores and instruction fetches fail to set R
...and JITIL does pretty much the same as the regular JIT does; except it does it differently.
GC/Wii run a PowerPC processor, unlike your PC which is x86-based - and as such, they have a different instruction set. You could say the GC/Wii speaks French while your PC speaks English.
JIT directly translates from PPC to x86, while JITIL uses an intermediate language in-between which allows for further optimization (think Java or .NET, where you get Bytecode/CIL instead of native machine code; which later on gets translated to native).
At the moment, it's probably better for you to stick with JIT, as JITIL was broken until a few revisions ago, and is still behind the JIT in terms of performance.
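
To show why an intermediate language can help, here is a small C++ sketch of one classic optimization pass, constant folding, run over a made-up IR. The IR here (SetConst and AddConst operations on a register) is purely hypothetical and much simpler than whatever JITIL actually uses; the point is only that once instructions are in an in-between form, adjacent operations can be merged before any host code is generated.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A made-up two-operation IR: "reg = value" and "reg += value".
enum class IrKind { SetConst, AddConst };

struct IrOp {
    IrKind kind;
    int reg;
    uint32_t value;  // unsigned so that wraparound is well defined
};

// Merges an AddConst into the operation right before it when both target the
// same register: "r = c; r += k" becomes "r = c + k", and "r += a; r += b"
// becomes "r += a + b".
std::vector<IrOp> FoldConstants(const std::vector<IrOp>& input) {
    std::vector<IrOp> output;
    for (const IrOp& op : input) {
        if (op.kind == IrKind::AddConst && !output.empty() &&
            output.back().reg == op.reg) {
            output.back().value += op.value;
            continue;
        }
        output.push_back(op);
    }
    return output;
}

// Usage example: "r3 = 5; r3 += 1" folds into the single operation "r3 = 6".
void FoldExample() {
    const std::vector<IrOp> folded =
        FoldConstants({{IrKind::SetConst, 3, 5}, {IrKind::AddConst, 3, 1}});
    assert(folded.size() == 1 && folded[0].value == 6);
}
```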
Thanks for your answers again. Why is emulating the MMU so complicated? Did the programmers manage the memory manually? When programming on Windows or whatever you just create variables and don't worry too much about where they are usually. The OS is managing the memory for you, right?