Dolphin, the GameCube and Wii emulator - Forums

Full Version: [Patch] dspHLE Mario galaxy 1 music
Unfortunately, you are going to have to deal with it for now. Either use the LLE plugin or put up with the music stopping; there is no other way.
(06-29-2010, 11:15 AM)Torin Wrote: This leads me to think that this part of the code from UCode_Zelda_voice.cpp:

Code:
int CUCode_Zelda::SizeForResampling(ZeldaVoicePB &PB, int size, int ratio) {
    // This is the little calculation at the start of every sample decoder
    // in the ucode.
    return (PB.CurSampleFrac + size * ConvertRatio(PB.RatioInt)) >> 16;
}

is wrong. As I stated before with the RE, the return should be something like this:

Code:
return (size * ConvertRatio(PB.RatioInt)) >> 16;

PB.CurSampleFrac should be gone, and the size adapted to 0x50. When I reached this point, I thought I should read UC_Zelda.txt, and the surprise is this:

Code:
    // 0a1b 0950      lris        $AX1.L, #0x50
    // 0a1c a000      mulx        $AX0.L, $AX1.L
    // 0a1d a400      mulxac      $AX0.L, $AX1.L, $ACC0
    // 0a1e 1404      lsl         $ACC0, #4
    // 0a1f 8c00      clr15      

    // Compute how much data we need to read, to get 0x50 samples after resampling.
    // AC0.L is cursamplefrac, AX0.L is ratio.
    $ACC0 = (PB.CurrentSampleFrac + 0x50* PB.Ratio) << 4;

They look similar, except that this one has PB.CurrentSampleFrac added in by mulxac, and mulxac only multiplies the other operands. So I tried removing CurrentSampleFrac and got music that played for a very long time, but after 4 or 5 repeats (I don't remember exactly) it cut off and I got a FIFO error. The strange thing is that I think the samples are being calculated wrong, so with each loop the address copied to memory is also wrong. I therefore think the factor for ConvertRatio should be changed (to 0x50?). Greets
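
For anyone following along, here is a minimal sketch of the two variants being compared. It assumes the PB ratio becomes a 16.16 fixed-point step after a left shift by 4 (suggested by the `lsl $ACC0, #4` in the listing above, not confirmed). `ConvertRatioSketch`, `SizeWithFrac` and `SizeWithoutFrac` are hypothetical names for illustration, not the emulator's functions.

```cpp
#include <cstdint>

// Assumption: shifting the PB ratio left by 4 yields a 16.16 fixed-point step.
constexpr uint32_t ConvertRatioSketch(uint32_t pbRatio) {
    return pbRatio << 4;
}

// Variant from the current source: the leftover fraction from the previous
// call is added before the integer part is taken.
constexpr uint32_t SizeWithFrac(uint32_t curSampleFrac, uint32_t size, uint32_t pbRatio) {
    return (curSampleFrac + size * ConvertRatioSketch(pbRatio)) >> 16;
}

// Variant proposed in the post: the fraction is dropped entirely.
constexpr uint32_t SizeWithoutFrac(uint32_t size, uint32_t pbRatio) {
    return (size * ConvertRatioSketch(pbRatio)) >> 16;
}
```

With a ratio of 0x1080 and 0x50 output samples, the variant without the fraction asks for 0x52 input samples, while a carried-over fraction of 0x8000 pushes that to 0x53. So dropping the fraction can change how much source data is consumed per call, which would shift the read position a little more on every pass.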

I decided to read this again. According to Zelda_Ucode.h, 0x50 is some unknown buffer. That might be part of the issue, considering it has been stated that the looping issue is a buffer issue, a sync issue, or both. Apparently, when the last five samples of the PB are 0, it tries to buffer and sync the music, but I wouldn't know where to look.

Yes, the music in SMG1/SMG2 is PCM16; however, the AFC decoder is used in some way to buffer the samples, sync, and loop the music. I've spent the last two days trying to figure out this and some other issues. I feel I may be headed in the right direction, but all attempts so far have been unsuccessful. The best I've done so far is to make the music last a bit longer.

I have a question for the developers: have any of you working on DSP emulation pinpointed where in the code looping fails? Has anyone actually debugged the code to get at least an estimate of where the music halts? If I had an idea of where the code is failing, I could concentrate on that specific area instead of jumping around the code, which is a bit confusing and a lot to handle at once.

I wonder where neXus and Torin are, since they've probably tried debugging. Maybe you have an idea, Jack Frost?
Har har, I think I may be figuring some stuff out. I've been digging around and altering code like heck today and yesterday. I have the AFC code simplified down to the point that it looks almost like PCM8 and PCM16 without breaking anything. Made some other alterations, so far I don't think I fixed the looping issue afaik. Maybe it's the wishful thinking in me, but I think I may have improved sound somewhat. :/

I also think I may know where the music is breaking in Super Mario Galaxy, though I'm not sure how complicated it will be to fix.

I think it's breaking around the following line of code, as Torin mentioned. I took the value at 0x50, which is an unknown buffer, and modified the code with it. I had music in SMG going three times as long, and it sounded smoother, as if it was actually streaming and transitioning through the loop for a while. Maybe when it reaches the end of the loop it tries to buffer the PBs and reset to the correct loop start position. If it can do that successfully, the music should loop without a problem. As for why it sometimes plays longer than other times, I don't know; maybe the developers have an explanation for that.

Around here:

Code:
return (PB.CurSampleFrac + PB.Unk50[0x8] * PB.RatioInt) << 4;

I can't do this alone so I would appreciate ideas and feedback from the developers.
I can help you out with testing, if that's any help.

Out of interest, what does the returned sample value represent?
Well, you already know that AFC is far from correct, and likely just a modified copy of PCM16.
Not sure if my pastie is still up, but it should show the important parts related to that buffer.

Where's that PB.Unk50 stuff from? It's not in my modified UCode_Zelda_Voice.cpp.
You're right, it isn't; it is in the RE of the Zelda Ucode, the text file. According to Ucode_Zelda.h, 0x50 appears to be an unknown buffer. I have the line of code Torin pointed out as follows:

Code:
return (PB.CurSampleFrac + PB.Unk50[0x8] + size * ConvertRatio(PB.RatioInt)) >> 16;

That may not be correct, but it seems to be the only thing that works without destroying audio in Zelda Ucode games or crashing the emulator, and it also seems to make the music last slightly longer. I realize that the current AFC code is way off. I am just trying to rework it so that it's more similar to PCM16 and PCM8, and I managed to do so without breaking anything, but I have more work to do. I am going to use bits of the RE'd code on pastie as a reference to hopefully get this working, but my lack of coding skills makes things more difficult, which is why I could use the help.

The trick is to buffer the PBs properly and reset to the correct LoopStartPos. It shouldn't require a major rewrite of the AFC code as far as I know; there is probably a simpler way to do it. I do need to know exactly where in the code the music is killed, so I was wondering if you could debug that, Jack Frost, if you would oblige? :)
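
As a rough sketch of what "reset to the correct LoopStartPos" might look like, following the "Check repeat mode" part of the RE listing posted in this thread: `VoiceSketch` and `HandleEndOfStreamSketch` are hypothetical names, the struct is only a stand-in for the real PB, and the loop address is written as a plain product, which is an assumption about what the DSP's two partial multiplies add up to.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parameter block; field names are illustrative.
struct VoiceSketch {
    uint16_t repeatMode = 0;
    uint16_t format = 0;        // 5 or 9 for the AFC decoder in the listing
    uint32_t loopStartPos = 0;
    uint32_t startAddr = 0;
    uint32_t curAddr = 0;
    int16_t yn1 = 0, yn2 = 0;   // decoder history
    int16_t loopYN1 = 0, loopYN2 = 0;
    uint16_t keyOff = 0;
};

// Called when the source data runs out. Returns true if the voice looped.
bool HandleEndOfStreamSketch(VoiceSketch& pb, std::vector<int16_t>& out,
                             std::size_t firstUnfilled) {
    if (pb.repeatMode == 0) {
        // No repeat: stop the voice and zero the rest of the output buffer.
        pb.keyOff = 1;
        const std::size_t from = std::min(firstUnfilled, out.size());
        std::fill(out.begin() + static_cast<std::ptrdiff_t>(from), out.end(), int16_t{0});
        return false;
    }
    // Repeat: rewind the read address (assumed full product) and
    // unconditionally restore the decoder history from the loop values.
    pb.curAddr = (pb.loopStartPos >> 4) * pb.format + pb.startAddr;
    pb.yn1 = pb.loopYN1;
    pb.yn2 = pb.loopYN2;
    return true;
}
```

The point of the sketch is only the ordering: on a loop, both the address and the YN1/YN2 history are reset before decoding continues, and the output is left alone so the caller can keep filling it.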

Do we know yet what the correct loop start position is for PCM16 and AFC audio? PB.Format also seems to be needed, or else screechy, echoing audio happens in Zelda Ucode games. Doing all of this without destroying audio is the main goal.

Another question, is there a way to simplify the following code?

Code:
PB.CurAddr = ((((((PB.RestartPos / 16) & 0xffff0000) * PB.Format) << 16) + (((PB.RestartPos / 16) & 0xffff) * PB.Format)) + PB.StartAddr) & 0xffffffff;

Thanks.
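
One possible answer to the simplification question, as a sketch that assumes all three PB fields are treated as unsigned 32-bit values (the function names are made up for illustration). Under that assumption the first term always vanishes: masking with 0xffff0000 leaves the low 16 bits zero, so after the multiply and the shift left by 16 every remaining bit has been pushed out of a 32-bit value. The DSP listing seems to add two 16-bit partial products in a wider accumulator, so the plain product may be what was actually intended; that is a guess, not something confirmed in this thread.

```cpp
#include <cstdint>

// Literal transcription of the expression from the post, with 32-bit fields.
uint32_t CurAddrLiteral(uint32_t restartPos, uint32_t format, uint32_t startAddr) {
    return ((((((restartPos / 16) & 0xffff0000) * format) << 16) +
             (((restartPos / 16) & 0xffff) * format)) + startAddr) & 0xffffffff;
}

// What the literal expression reduces to in 32-bit unsigned arithmetic:
// only the low 16 bits of (restartPos / 16) contribute.
uint32_t CurAddrLowHalf(uint32_t restartPos, uint32_t format, uint32_t startAddr) {
    return ((restartPos / 16) & 0xffff) * format + startAddr;
}

// Possible intent (assumption): the two partial products sum to the full product.
uint32_t CurAddrFullProduct(uint32_t restartPos, uint32_t format, uint32_t startAddr) {
    return (restartPos / 16) * format + startAddr;
}
```

All three agree while RestartPos / 16 fits in 16 bits. They only diverge for larger positions, where the literal form silently drops the high half.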
Just wondering if the following RE is for the AFC decoder in Zelda_Ucode.txt?

Quote:
// SAMPLER DECODER FOR FORMAT 0x05, 0x09
void 073d_DECODE_0x05_0x09(_dest($AR3), _numberOfSamples($AC1.M), _len(AX1)) // AX1 is 0x50 all the time
{
073d 0092 0004 lri $CR, #0x0004
073f 8100 clr $ACC0
// 0740 2604 lrs $AC0.M, @0x0004
// 0741 b100 tst $ACC0
// 0742 02b4 0717 callne 0x0717
if (*0x0404)
0717_InitializeDecoderState()

// 0744 8100 clr $ACC0
// 0745 2601 lrs $AC0.M, @0x0001
// 0746 b100 tst $ACC0

if (0x0401 != 0)
{
// 0747 0294 07e5 jnz 0x07e5 // early out
GOTO 0x07e5;
}

0749 2232 lrs $AX0.H, @0x0032
074a c900 cmpar $ACC0, $AX1.H
074b 0293 070e jle 0x070e // another early out
074d 5500 subr $ACC1, $AX0.H

// 074e 02bf 06f9 call 0x06f9
06f9_Unk_PrepareSampleDecode()

// check if there are samples left ...
0750 223a lrs $AX0.H, @0x003a
0751 8600 tstaxh $AX0.H
0752 0294 0759 jnz 0x0759

0754 8100 clr $ACC0
0755 263b lrs $AC0.M, @0x003b
0756 8200 cmp
0757 0291 07ab jl 0x07ab
if ()
{
// compute how many samples we have to copy
0759 8100 clr $ACC0
075a 1fdf mrr $AC0.M, $AC1.M
075b 040f addis $ACC0, #0x0f
075c 147c lsr $ACC0, #-4
075d 1f7e mrr $AX1.H, $AC0.M

075e 0c00 lris $AC0.L, #0x00
075f 1404 lsl $ACC0, #4
0760 1f1e mrr $AX0.L, $AC0.M
0761 0a00 lris $AX0.H, #0x00
0762 8100 clr $ACC0
0763 263a lrs $AC0.M, @0x003a
0764 243b lrs $AC0.L, @0x003b
0765 5800 subax $ACC0, $AX0
0766 0290 0771 jge 0x0771
if ()
{
0768 8100 clr $ACC0
0769 263b lrs $AC0.M, @0x003b
076a 5c00 sub $ACC0, $ACC1
076b 2e32 srs @0x0032, $AC0.M
076c 8100 clr $ACC0
076d 2e3a srs @0x003a, $AC0.M
076e 2e3b srs @0x003b, $AC0.M
// 076f 029f 0777 jmp 0x0777
}
else
{
0771 2e3a srs @0x003a, $AC0.M
0772 2c3b srs @0x003b, $AC0.L
0773 0c00 lris $AC0.L, #0x00
0774 1fd8 mrr $AC0.M, $AX0.L
0775 5c00 sub $ACC0, $ACC1
0776 2e32 srs @0x0032, $AC0.M
}


0777 8100 clr $ACC0
0778 1fdb mrr $AC0.M, $AX1.H

// 0779 02bf 07eb call 0x07eb
07eb_AFCDecoder();

077b 2232 lrs $AX0.H, @0x0032
077c 8600 tstaxh $AX0.H
077d 0295 07a8 jz 0x07a8
077f 0a10 lris $AX0.H, #0x10
0780 8100 clr $ACC0
0781 1fc3 mrr $AC0.M, $AR3
0782 5400 subr $ACC0, $AX0.H
0783 1c7e mrr $AR3, $AC0.M
0784 0080 0458 lri $AR0, #0x0458
0786 197e lrri $AC0.M, @$AR3
0787 197a lrri $AX0.H, @$AR3
0788 100e loopi #0x0e
0789 64a2 movr'sl $ACC0, $AX0.H : $AC0.M, $AX0.H
078a 1b1e srri @$AR0, $AC0.M
078b 1b1a srri @$AR0, $AX0.H
// 078c 8100 clr $ACC0
// 078d 263a lrs $AC0.M, @0x003a
// 078e 243b lrs $AC0.L, @0x003b
// 078f b100 tst $ACC0
// 0790 0294 07a8 jnz 0x07a8
if (![3a,3b]) {
0792 2232 lrs $AX0.H, @0x0032
0793 8600 tstaxh $AX0.H
0794 0295 07a8 jz 0x07a8
0796 0080 0467 lri $AR0, #0x0467
// 0798 8100 clr $ACC0
// 0799 268b lrs $AC0.M, @0xff8b
// 079a b100 tst $ACC0
// 079b 0295 07a8 jz 0x07a8
if (*0x048b) {
// Round up
079d 0200 000f addi $AC0.M, #0x000f
079f 0240 000f andi $AC0.M, #0x000f
07a1 0200 0458 addi $AC0.M, #0x0458
07a3 1c7e mrr $AR3, $AC0.M
// backwards copy loop
07a4 007a 07a7 bloop $AX0.H, 0x07a7
07a6 18fe lrrd $AC0.M, @$AR3
07a7 1a9e srrd @$AR0, $AC0.M
}
}
07a8 0092 00ff lri $CR, #0x00ff

// 07aa 02df ret
return
}
else
{
07ab b100 tst $ACC0
07ac 0295 07bb jz 0x07bb
07ae 5d00 sub $ACC1, $ACC0
07af 040f addis $ACC0, #0x0f
07b0 147c lsr $ACC0, #-4
07b1 0c00 lris $AC0.L, #0x00
07b2 00e3 0363 sr @0x0363, $AR3

// 07b4 02bf 07eb call 0x07eb
07eb_AFCDecoder();

07b6 00de 0363 lr $AC0.M, @0x0363
07b8 223b lrs $AX0.H, @0x003b
07b9 4400 addr $ACC0, $AX0.H
07ba 1c7e mrr $AR3, $AC0.M


07bb 8100 clr $ACC0

// Check repeat mode.
07bc 2681 lrs $AC0.M, @0xff81
07bd b100 tst $ACC0
07be 0295 07e3 jz 0x07e3 // stop rendering, see below 7e3

// Repeat.
// 07c0 2380 lrs $AX1.H, @0xff80
// 07c1 2688 lrs $AC0.M, @0xff88
// 07c2 2489 lrs $AC0.L, @0xff89
// 07c3 1408 lsl $ACC0, #8
// 07c4 14f4 asr $ACC0, #-12

$ACC0 = PB.LoopStartPos >> 4

//07c5 2380 lrs $AX1.H, @0xff80
//07c6 8d00 set15
//07c7 c810 mulc'mv $AC0.M, $AX1.H : $AX0.L, $AC0.L

$AX0.l = (PB.LoopStartPos >> 4) & 0xffff;
prod = (PB.LoopStartPos >> 4 & 0xffff0000)*PB.Format;

//07c8 ae00 mulxmv $AX0.L, $AX1.H, $ACC0

$ACC0 = (PB.LoopStartPos >> 4 & 0xffff0000)*PB.Format;
prod = ((PB.LoopStartPos >> 4) & 0xffff)*PB.Format;

//07c9 8c00 clr15
//07ca f000 lsl16 $ACC0

$ACC0 = (((PB.LoopStartPos >> 4) & 0xffff0000)*PB.Format)<<16

//07cb 4e00 addp $ACC0

$ACC0 = ((((PB.LoopStartPos >> 4) & 0xffff0000)*PB.Format)<<16)+
(((PB.LoopStartPos >> 4) & 0xffff)*PB.Format)

// 07cc 238c lrs $AX1.H, @0xff8c
// 07cd 218d lrs $AX1.L, @0xff8d
// 07ce 4a00 addax $ACC0, $AX1

$ACC0 = (((((PB.LoopStartPos >> 4) & 0xffff0000)*PB.Format)<<16)+
(((PB.LoopStartPos >> 4) & 0xffff)*PB.Format))+PB.StartAddr

// 07cf 2e38 srs @0x0038, $AC0.M
// 07d0 2c39 srs @0x0039, $AC0.L

PB.CurAddr = $ACC0 & 0xffffffff;

// 07d1 2682 lrs $AC0.M, @0xff82
// 07d2 2e67 srs @0x0067, $AC0.M
// 07d3 2683 lrs $AC0.M, @0xff83
// 07d4 2e66 srs @0x0066, $AC0.M
//Unconditionally (!) copy YN1 and YN2 from loopyn2 and loopyn1

PB.YN1 = PB.LoopYN1;
PB.YN2 = PB.LoopYN2;

07d5 00e3 0363 sr @0x0363, $AR3
07d7 0083 0458 lri $AR3, #0x0458
07d9 8100 clr $ACC0
07da 0e01 lris $AC0.M, #0x01

// 07db 02bf 07eb call 0x07eb
07eb_AFCDecoder();

07dd 00c3 0363 lr $AR3, @0x0363
07df 02bf 0729 call 0x0729
07e1 029f 0749 jmp 0x0749

// No repeat
// stop rendering of this PB (0x401 == 1) and clear the output buffer with zeroes...
//07e3 0e01 lris $AC0.M, #0x01
//07e4 2e01 srs @0x0001, $AC0.M

PB.KeyOff = 1;

early_out:
// Zero the buffer.
07e5 8100 clr $ACC0
07e6 005f loop $AC1.M
07e7 1b7e srri @$AR3, $AC0.M
07e8 0092 00ff lri $CR, #0x00ff

// 07ea 02df ret
return
}
}




void 07eb_AFCDecoder(_numberOfSample(AC0.M))
{
// 07eb 00ff 0360 sr @0x0360, $AC1.M
// 07ed 00fe 0361 sr @0x0361, $AC0.M
// 07ef 2638 lrs $AC0.M, @0x0038
// 07f0 2439 lrs $AC0.L, @0x0039
// 07f1 0f05 lris $AC1.M, #0x05
// 07f2 02bf 05ad call 0x05ad
05ad_SetupAccelerator(AC0.M, AC0.L, AC1.M)

// 07f4 2638 lrs $AC0.M, @0x0038
// 07f5 2439 lrs $AC0.L, @0x0039
// 07f6 8900 clr $ACC1
// 07f7 00df 0361 lr $AC1.M, @0x0361
// 07f9 2280 lrs $AX0.H, @0xff80
// 07fa d000 mulc $AC1.M, $AX0.H
// 07fb 6f00 movp $ACC1
// 07fc 4c00 add $ACC0, $ACC1
// 07fd 2e38 srs @0x0038, $AC0.M
// 07fe 2c39 srs @0x0039, $AC0.L
// increase sample offset in ARAM
AC0 = (*0x0038 << 16) | *0x0039
AC0 = AC0 + _numberOfSample * *0x0480 // bytes per sample
*0x0038 = AC0.M
*0x0039 = AC0.L


// 07ff 8100 clr $ACC0
// 0800 00de 0361 lr $AC0.M, @0x0361
//0802 007e 086b bloop $AC0.M, 0x086b
for (int i = 0; i < _numberOfSample; i++)
{
// Look for the lrrn below to find the ARAM reads.

// FFD3 seems to be some interface to do plain single byte reads
// from ARAM with no ADPCM fanciness or similar.

// It loads through AR0 loaded with immediate #ffd3, not through
// lrs, so CR doesn't affect the effective address.

0804 0080 ffd3 lri $AR0, #0xffd3
0806 0084 0000 lri $IX0, #0x0000
0808 199e lrrn $AC0.M, @$AR0
0809 8900 clr $ACC1
080a 1ffe mrr $AC1.M, $AC0.M
080b 1401 lsl $ACC0, #1
080c 0240 001e andi $AC0.M, #0x001e
080e 0200 0300 addi $AC0.M, #0x0300 // AFC COEF Table
0810 1c3e mrr $AR1, $AC0.M
0811 157c lsr $ACC1, #-4
0812 0340 000f andi $AC1.M, #0x000f
0814 0a11 lris $AX0.H, #0x11
0815 5500 subr $ACC1, $AX0.H

// 0816 8100 clr $ACC0
// 0817 2680 lrs $AC0.M, @0xff80
// 0818 0605 cmpis $ACC0, #0x05
// 0819 0295 0832 jz 0x0832
if (*0x480 != 0x5) // ( == 0x09)
{
081b 009a 00f0 lri $AX0.H, #0x00f0
081d 0b0f lris $AX1.H, #0x0f
081e 0082 0364 lri $AR2, #0x0364
0820 1998 lrrn $AX0.L, @$AR0
0821 6000 movr $ACC0, $AX0.L

// Unpack 14 of the nibbles..
0822 1107 0829 bloopi #0x07, 0x0829
for (int j=0; j<7; j++)
{
0824 3400 andr $AC0.M, $AX0.H
0825 1408 lsl $ACC0, #8
0826 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M

0827 3644 andr'ln $AC0.M, $AX1.H : $AX0.L, @$AR0
0828 140c lsl $ACC0, #12
0829 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
}
// Then do the last two ..
082a 3400 andr $AC0.M, $AX0.H
082b 1408 lsl $ACC0, #8
082c 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
082d 3600 andr $AC0.M, $AX1.H
082e 140c lsl $ACC0, #12
082f 1b5e srri @$AR2, $AC0.M

0830 029f 0852 jmp 0x0852
}
else // (*0x480 == 5)
{
0832 009a c000 lri $AX0.H, #0xc000
0834 0082 0364 lri $AR2, #0x0364
0836 1998 lrrn $AX0.L, @$AR0
0837 6000 movr $ACC0, $AX0.L

// Unpack half nibbles (half quality, ~half space)
//0838 1103 0845 bloopi #0x03, 0x0845
for (j=0; j<3; j++)
{
083a 1408 lsl $ACC0, #8
083b 3400 andr $AC0.M, $AX0.H
083c 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
083d 140a lsl $ACC0, #10
083e 3400 andr $AC0.M, $AX0.H
083f 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
0840 140c lsl $ACC0, #12
0841 3400 andr $AC0.M, $AX0.H
0842 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
0843 140e lsl $ACC0, #14
0844 3444 andr'ln $AC0.M, $AX0.H : $AX0.L, @$AR0
0845 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
}

0846 1408 lsl $ACC0, #8
0847 3400 andr $AC0.M, $AX0.H
0848 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
0849 140a lsl $ACC0, #10
084a 3400 andr $AC0.M, $AX0.H
084b 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
084c 140c lsl $ACC0, #12
084d 3400 andr $AC0.M, $AX0.H
084e 6032 movr's $ACC0, $AX0.L : @$AR2, $AC0.M
084f 140e lsl $ACC0, #14
0850 3400 andr $AC0.M, $AX0.H
0851 1b5e srri @$AR2, $AC0.M
}

0852 8f00 set40
0853 1f7f mrr $AX1.H, $AC1.M
0854 2066 lrs $AX0.L, @0x0066
0855 2767 lrs $AC1.M, @0x0067
0856 193a lrri $AX0.H, @$AR1
0857 1939 lrri $AX1.L, @$AR1
0858 0080 0364 lri $AR0, #0x0364
085a 1c80 mrr $IX0, $AR0
085b a000 mulx $AX0.L, $AX1.L
085c ea70 maddc'l $AC1.M, $AX1.L : $AC0.M, @$AR0

// ADPCM decoding main loop.
085d 1108 0866 bloopi #0x08, 0x0866
for (int i=0; i<8; i++)
{
085f 3a93 orr'sl $AC0.M, $AX1.H : $AC1.M, $AX1.L
0860 a478 mulxac'l $AX0.L, $AX1.L, $ACC0 : $AC1.M, @$AR0
0861 1485 asl $ACC0, #5
0862 e833 maddc's $AC0.M, $AX1.L : @$AR3, $AC0.M
0863 3b92 orr'sl $AC1.M, $AX1.H : $AC0.M, $AX1.L
0864 a570 mulxac'l $AX0.L, $AX1.L, $ACC1 : $AC0.M, @$AR0
0865 1585 asl $ACC1, #5
0866 ea3b maddc's $AC1.M, $AX1.L : @$AR3, $AC1.M
}
0867 2f67 srs @0x0067, $AC1.M
0868 8e00 set16
0869 1ff8 mrr $AC1.M, $AX0.L
086a 2f66 srs @0x0066, $AC1.M
086b 8900 clr $ACC1
}
086c 00df 0360 lr $AC1.M, @0x0360
086e 02df ret
}
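
To make the unpacking part of the listing easier to read, here is a small sketch of how one frame appears to be split up, going only by the masks and shifts in the RE above (0x00f0 / 0x0f for format 9, 0xc000 for format 5). `AfcFrameSketch` and `UnpackAfcFrameSketch` are hypothetical names, the meaning of the high header nibble is not established here, and the prediction step with the coefficient table is left out.

```cpp
#include <array>
#include <cstdint>

// Hypothetical container for one unpacked frame (16 raw values).
struct AfcFrameSketch {
    uint8_t coefIndex;   // low nibble of the header byte, indexes the coef table
    uint8_t highNibble;  // high nibble of the header byte (purpose assumed: scale)
    std::array<int16_t, 16> raw;  // each value left-aligned in a 16-bit word
};

// format 9: header + 8 bytes, two 4-bit values per byte.
// format 5: header + 4 bytes, four 2-bit values per byte.
AfcFrameSketch UnpackAfcFrameSketch(const uint8_t* src, int format) {
    AfcFrameSketch f{};
    f.coefIndex = src[0] & 0x0f;
    f.highNibble = src[0] >> 4;
    if (format == 9) {
        for (int i = 0; i < 8; i++) {
            const uint8_t b = src[1 + i];
            f.raw[2 * i]     = static_cast<int16_t>((b & 0xf0) << 8);
            f.raw[2 * i + 1] = static_cast<int16_t>((b & 0x0f) << 12);
        }
    } else {  // format 5
        for (int i = 0; i < 4; i++) {
            const uint8_t b = src[1 + i];
            f.raw[4 * i]     = static_cast<int16_t>((b & 0xc0) << 8);
            f.raw[4 * i + 1] = static_cast<int16_t>((b & 0x30) << 10);
            f.raw[4 * i + 2] = static_cast<int16_t>((b & 0x0c) << 12);
            f.raw[4 * i + 3] = static_cast<int16_t>((b & 0x03) << 14);
        }
    }
    return f;
}
```

Either way one frame yields 16 values, which is consistent with the main ADPCM loop in the listing running 8 times and storing two samples per iteration.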

I also need to know what PB values ACC0, ACC1, AC0.M, AX0.L, ACM.M, AX1.L, AX1.M, etc. refer to, as that would make referencing the docs and RE easier for me.
Besides the buffer, it appears that NumberOfSamples is not implemented, or most of it is missing apart from a few commented lines. It's probably not returning the correct number of samples when the music tries to loop. There also appears to be quite a bit missing, but I'm having trouble making heads or tails of the RE'd Ucode in Ucode_Zelda.txt.
Is there also a way to see what the PB memory addresses refer to? I want to try implementing FilterBufferInPlace/Filterstate in this block of code, but I can't find the PB values for the memory addresses in Zelda_Ucode.txt:

Quote:
if (PB.FilterEnable != 0)
{ // 0x04a8
for (int i = 0; i < _Size; i++)
{
// TODO: Apply filter from ZWW: 0c84_FilterBufferInPlace
}
}

I need to know what 0x038f, 0x0520, 0x0484, and 0x0440 are. Has this been RE'd yet? I'm not able to find much.
by the pricking of my thumb, something wickedly working this way comes...
I just got a feeling ^^
Don't get your hopes too high, Strip! :P