Programming Discussion Thread - Printable Version

Dolphin, the GameCube and Wii emulator - Forums (https://forums.dolphin-emu.org)
Forum: Offtopic (https://forums.dolphin-emu.org/Forum-offtopic)
Forum: Delfino Plaza (https://forums.dolphin-emu.org/Forum-delfino-plaza)
Thread: Programming Discussion Thread (/Thread-programming-discussion-thread)

RE: Programming Discussion Thread - neobrain - 03-09-2014

(03-09-2014, 08:19 AM)Shonumi Wrote: @neobrain - That's pretty interesting. Just glancing over it, I think I can see where there's a pattern starting (seems to repeat for the rest of the indices after a certain point). Though what exactly it is (or rather why it is what it is) requires more poking. I'll take a look at it later tonight if you want, can't say I'll be useful though.

Yeah, the pattern as far as I could see was that with each row, the value increases by 4, apart from rows 0x21, 0x61, 0xA0 and 0xE0 (each of these plus or minus 1, because I'd have to look it up again to be sure), where the value increases merely by 3 compared to the previous row. That said, making a reasonable algorithm other than "if(row==0x21)..." out of this has proven hard for me. It probably looks dead-easy once we find the algorithm, though :/

For what it's worth, if you really want to give it a shot (which I'd greatly appreciate btw!), you can download a more complete log of test results at https://dl.dolphin-emu.org/nbx/log1.tar.gz . The format of that table is basically:

a b c: hw tfn

"a", "b" and "c" are the unsigned 8-bit input variables used in the lerp, "hw" is the result returned on hardware, and "tfn" is the (incorrect) behavior of my current tev-fixes-new branch (which uses ((a*(255-c) + b*c) / 255) * 4 to implement the lerp-scaling). The table includes all configurations for a=0..0x10 and any values for b and c.

That said, I would probably be happy enough to find a reasonable algorithm for the a=0, b=255, c=variable case, the results of which I had posted before :|

RE: Programming Discussion Thread - Shonumi - 03-09-2014

Sounds good, I'll take a look at the log you posted some time tonight. Might be on IRC later on then.

RE: Programming Discussion Thread - neobrain - 03-09-2014

Fwiw, it'll likely be another 12 hours until I'm back online.
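
For reference, the "tfn" formula quoted above can be sketched in C++ roughly as follows. This is only an illustration, not the actual tev-fixes-new code: the function name `tfn_lerp`, the use of plain `int`, truncating integer division, and the grouping ((a*(255-c) + b*c) / 255) * 4 are all assumptions.

```cpp
// Sketch of the "tfn" lerp described in the post above.
// Assumptions (not confirmed by the thread): plain int arithmetic,
// truncating integer division, and the parenthesization used here.
// tfn_lerp is a hypothetical name chosen for this example.
constexpr int tfn_lerp(int a, int b, int c)
{
    // a, b and c are the unsigned 8-bit inputs (0..255).
    return ((a * (255 - c) + b * c) / 255) * 4;
}

// For a = 0, b = 255 the division is exact, so the result is simply 4 * c.
static_assert(tfn_lerp(0, 255, 0x20) == 0x80, "a=0, b=255 reduces to 4*c");
static_assert(tfn_lerp(0, 255, 0x21) == 0x84, "step of 4 per increment of c");
```

Under these assumptions the a=0, b=255 case always steps by exactly 4, so it cannot reproduce the rows where the hardware value increases by only 3.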

RE: Programming Discussion Thread - Shonumi - 03-09-2014

Well, I didn't actually get a chance to look through the log file yet, but I managed to pull together a small C++ program that reproduces the output for the a=0, b=255, c=variable case (the pastie link you posted earlier). Thought it'd be more straightforward, but it looks like the original algorithm probably does something with signed numbers, since it apparently changes things up for whatever reason after the index is greater than 0x80. Here's the link for it: http://pastie.org/private/y32ei9yqkqhenbb5j2o7ca

It's probably not intuitive (at least it's not an elegant one-liner, and it's probably not simplified as much as it could be). The "zones" refer to the 4 segments the algorithm seems to affect (indices 0x21 <-> 0x60 is Zone 0, 0x61 <-> 0xA0 is Zone 1, 0xA1 <-> 0xE0 is Zone 2, and 0xE0 <-> 0x20 is Zone 3). Basically I used the "zones" to determine what should be what I call the "additive". The basic formula should simply be (Index * 4) - additive (masked by 0xFF, obviously), but that only got me correct values for Index 0 through Index 0x80.

It seems that after 0x80, the algorithm calculates values that are technically one step ahead of the Index value, i.e. using (Index * 4) - additive gives a value of 0xFE at Index 0x80, when on real hardware it jumps immediately to a value of 0x2 at Index 0x80 (which just happens to be the next value, at Index 0x81). So from Index 0x80 to 0xA0, the formula is then ((Index + 1) * 4) - additive.

After 0xA0, values are calculated regularly using (Index * 4) - additive, but how the "zones" are calculated needs to be changed, e.g. treat Index 0xA0 as if it were part of Zone 2 (in which case the additive is not -2, but rather +1). Basically the "zone" boundaries are shifted by 1. Consequently, this also resets adding 1 to the Index, so for Index 0xA0 to 0xFF, the formula is the original (Index * 4) - additive.
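
The base formula and the zone lookup described above might be sketched like this. This is not the pastie program, which is not reproduced in the thread: `zone_of` and `zone_formula` are hypothetical names, and the additive value of 2 used in the checks is inferred only from the 0xFE and 0x2 values quoted in the post. Index 0xE0 is listed under both Zone 2 and Zone 3 in the post; the sketch arbitrarily assigns it to Zone 2.

```cpp
// Zone boundaries as listed in the post; Zone 3 wraps around through 0xFF/0x00.
// Assumption: 0xE0 (listed under both Zone 2 and Zone 3) is treated as Zone 2.
constexpr int zone_of(int index)
{
    return (index >= 0x21 && index <= 0x60) ? 0
         : (index >= 0x61 && index <= 0xA0) ? 1
         : (index >= 0xA1 && index <= 0xE0) ? 2
         : 3;
}

// Base formula from the post: (Index * 4) - additive, masked to 8 bits.
// The per-zone additive values are not given in full, so the additive is
// passed in by the caller here.
constexpr int zone_formula(int index, int additive)
{
    return ((index * 4) - additive) & 0xFF;
}

// With an additive of 2 (inferred from the quoted values), the base formula
// yields 0xFE at Index 0x80 and 0x02 at Index 0x81, matching the post.
static_assert(zone_formula(0x80, 2) == 0xFE, "value quoted for Index 0x80");
static_assert(zone_formula(0x81, 2) == 0x02, "value quoted for Index 0x81");
```

The "one step ahead" behaviour reported for Index 0x80 to 0xA0 then amounts to calling `zone_formula(index + 1, additive)`, i.e. multiplying only after adding 1 to the index.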

Sorry if my explanation seems a bit scattered or confusing, but hopefully this and the code help you in some way. I'll give the log a good look-over tomorrow and see if I can come up with something for that too.

RE: Programming Discussion Thread - neobrain - 03-09-2014

Yeah, that's pretty much a more formalized description of what I was thinking. Unfortunately, it already breaks down when moving from the a=0,b=255 case to a=0,b=0x80: In that case, the formula c*2 in fact describes correct hardware behavior, whereas with the "- additive" tricks you would get different results :/ .. but then again that might depend on how you generalize the formula given above. For my testing, I used the general lerp function

Code: int odd_lerp(int a, int b, int c)

Fwiw, I've asked some mame developers for help and one of them told me that it'd probably be useful to have the full tables for different configurations (i.e. other scaling factors than 4). Hence that's what I'll be focusing on for the next few days/weeks (depending on how long the tests take to run... the configurations in the log I provided above already took hours to be tested, but the tests are somewhat unoptimized, so maybe I can cut it down).

RE: Programming Discussion Thread - AnyOldName3 - 03-09-2014

Well I'm glad the devs aren't the kind of people who, having already tested the actual hardware with all possible inputs, would just put all of these in a lookup table.

RE: Programming Discussion Thread - neobrain - 03-10-2014

(03-09-2014, 11:53 PM)AnyOldName3 Wrote: Well I'm glad the devs aren't the kind of people who having already tested the actual hardware with all possible inputs would just put all of these in a lookup table.

Sometimes the actual hardware itself uses lookup tables, you know.

Just a quick update on the topic: today I've extended my testing code to also retrieve the upper 3 bits of the color combiner result, and it's actually fairly easy to see a pattern then.

a=0,b=127 case: The "additive" constant has a jump at c=0x21, 0x61, 0xa0 and 0xe0. The corresponding output values (10 bit + 1 sign bit) are 65, 192, 319 and 446. It's easy to see that the distance between each of these is 127, and that 65 is the smallest integer greater than 128/2. Or seen differently, the last value for c which yields "expected" outputs is 0x20, for which we get the output 64, which is 128/2.

a=0,b=255 case: The "additive" constant has a jump at c=0x21, 0x61, 0xa0 and 0xe0. The corresponding output values (10 bit + 1 sign bit) are 131, 386, 641 and 896. It's easy to see that the distance between each of these is 255, and that 131 is... well, not quite the smallest integer greater than 255/2. However, the last value for c which yields "expected" outputs is 0x20, for which we get the output 128, which is 256/2.

These are just the two special cases I looked at so far. Either way, it's a huge step forward compared to yesterday, where I couldn't even find any reasonable pattern for the a=0,b=255 case. Let's see how easily I can extend the pattern to arbitrary values.
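
The spacing claim for the two cases above can be checked mechanically. The short C++ sketch below uses only the eight output values quoted in the post; the array and function names (`kJumps127`, `kJumps255`, `evenly_spaced`) are made up for this example.

```cpp
// Outputs (10 bit + 1 sign bit) reported at the four jump positions
// c = 0x21, 0x61, 0xa0 and 0xe0, copied from the post above.
constexpr int kJumps127[4] = {65, 192, 319, 446};   // a=0, b=127
constexpr int kJumps255[4] = {131, 386, 641, 896};  // a=0, b=255

// Returns true if consecutive entries are exactly `step` apart.
constexpr bool evenly_spaced(const int (&v)[4], int step)
{
    return v[1] - v[0] == step && v[2] - v[1] == step && v[3] - v[2] == step;
}

// In both cases the distance between jump outputs equals b.
static_assert(evenly_spaced(kJumps127, 127), "b=127: jumps are 127 apart");
static_assert(evenly_spaced(kJumps255, 255), "b=255: jumps are 255 apart");
```

This only confirms the arithmetic for the two posted cases; whether the same relationship holds for arbitrary a and b is exactly what remains to be tested.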

RE: Programming Discussion Thread - Anti-Ultimate - 03-10-2014

What would be the benefits of doing whatever you're trying to figure out right now?

RE: Programming Discussion Thread - delroth - 03-10-2014

Making Flipper emulation more accurate.

RE: Programming Discussion Thread - Shonumi - 03-10-2014

Glad to hear you're making progress, neobrain. Hope everything goes according to plan on your end.