[PATCH] DSP LLE faster masked math
11-26-2010, 09:42 PM
Can anyone compile this new patch, please?
I would do it myself, but I am being drowned by dozens of linker errors
11-27-2010, 11:00 PM
My understanding is that more sounds just means more work for LLE to do. In that sense making it faster would require making LLE faster.
Here is a build with the last patch I posted: 6482m x64 (with lle comp4 patch)
11-28-2010, 01:46 AM
I think that LLE on thread should be improved too. That option would be awesome for those of us with less powerful rigs that still have three or four cores, where one or two spare cores sit idle while Dolphin is running.
11-28-2010, 04:33 AM
I included the patched plugin in my latest build here:
http://www.xtemu.com/forum/files/categor...vn-builds/ or here: http://cid-b8ece3275d590dda.office.live....ic/Dolphin
11-28-2010, 11:22 PM
(11-25-2010, 03:55 PM)Mylek Wrote: There was a pretty big bug in my jit dec code that should be fixed now with this version.
It is extremely likely that the hardware does not do the loops here, but instead just checks once:
Code:
+ while ((s16)ar > (s16)mask)
By the way, I have my own little implementation of the increase/decrease:
Code:
inline u16 dsp_add_addr_reg(u16 ar, s16 ix, u16 wr) {
I'd really like to have some more tests done with ix > wr, like wr == 0, ix == 2, 3 and wr == 2, ix == 8.
Nice.
If we can ignore the case of multiple wraparounds then I think your code is a better approach. Thinking about it, it could be optimized further to even work without ToMask, which should give a big speedup:
Code:
inline u16 dsp_add_addr_reg(u16 ar, s16 ix, u16 wr) {
In pseudocode, this would reduce the entire add function to ~15 assembly instructions.
Code:
MOV AX, ar
If the logic is sound it could be used to simplify all the other masked functions.
Implemented the minimal versions of the functions without ToMask based on the above code. It seems to work without any hitches from brief testing. The weakness of this implementation is that the add/sub can go out of bounds if ix > wr, but I'm not sure if this ever happens.
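To make the ix > wr concern concrete, here is a minimal sketch of the difference between wrapping in a loop and checking only once. It is not the code from the patch and not Dolphin's actual DSP logic: it assumes a simplified model where the circular buffer covers [base, base + wr] with an explicit, hypothetical `base` argument, and the names `wrap_add_loop` and `wrap_add_once` are made up for illustration.

```cpp
#include <cstdint>

using u16 = std::uint16_t;
using s16 = std::int16_t;
using s32 = std::int32_t;

// ASSUMPTION (simplified model, not the patch): the circular buffer
// covers the wr + 1 addresses base .. base + wr, and ar starts inside it.

// Looping variant: keeps correcting until the result is back in the buffer.
constexpr u16 wrap_add_loop(u16 base, u16 ar, s16 ix, u16 wr)
{
    const s32 len = s32(wr) + 1;
    s32 next = s32(ar) + ix;
    while (next > s32(base) + wr) next -= len;  // went past the end
    while (next < s32(base)) next += len;       // went below the start
    return u16(next);
}

// Single-check variant: corrects at most once, so no loop is needed.
constexpr u16 wrap_add_once(u16 base, u16 ar, s16 ix, u16 wr)
{
    const s32 len = s32(wr) + 1;
    s32 next = s32(ar) + ix;
    if (next > s32(base) + wr) next -= len;
    else if (next < s32(base)) next += len;
    return u16(next);
}

// With |ix| <= wr + 1 both variants agree.
static_assert(wrap_add_loop(0x100, 0x102, 2, 3) == 0x100, "loop wraps once");
static_assert(wrap_add_once(0x100, 0x102, 2, 3) == 0x100, "single check wraps once");

// With wr == 2, ix == 8 the single check ends up outside base .. base + wr.
static_assert(wrap_add_loop(0x100, 0x100, 8, 2) == 0x102, "loop stays in bounds");
static_assert(wrap_add_once(0x100, 0x100, 8, 2) == 0x105, "single check overshoots");
```

In this model the two variants only differ when the step is larger than the buffer, which is exactly the case the hardware tests with ix > wr would have to settle.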
Then again, since we don't have any testing data on hardware for when ix > wr this could even be correct behavior.
lle_masked_math_special.patch (Size: 13.46 KB / Downloads: 233)
11-29-2010, 10:24 AM
Mylek, any chance that this patch will be included in the official release, since you are a Dolphin developer? I am especially curious about what Xtreme2damax said, that it fixes "garbage noise, robotic audio and static from before" (they made LLE unusable imo). Even if it lowers compatibility, those screeching high-pitched noises make compatibility very low as things are now, so I can't see how this would make things worse.
11-29-2010, 11:14 AM
(This post was last modified: 11-29-2010, 11:16 AM by Xtreme2damax.)
There is still some relatively minor static with some games (could have sworn this was all gone except for one game with a minor issue, maybe the latest patch?), however the audio is more audible and clear. Just about all of the garbage noise, static, crackling and screeching is gone though.
In regards to performance improvements, as I said, the real killer to performance is when multiple samples/effects are occurring at one time; these can decrease FPS by 10 - 25 or more depending on what is happening. By the way, my latest build has the patched LLE plugin, but it isn't the latest patch: http://www.xtemu.com/forum/files/categor...vn-builds/
11-29-2010, 04:13 PM
(11-29-2010, 08:44 AM)Mylek Wrote: Then again, since we don't have any testing data on hardware for when ix > wr this could even be correct behavior.
http://home.amis.net/mpuljar/dolphin/AR_crap.7z
Here you have a bunch of test data (.dol files and Wii results).