Dolphin, the GameCube and Wii emulator - Forums

Full Version: Hyrule Field Slowdown Observation
(07-09-2010, 11:27 PM)skid Wrote:
(07-09-2010, 09:46 PM)darkshadw Wrote: And no, high textures make the emulator slower :p.

Even if low res textures are used?

Try it out in Super Mario Bros Wii, 8-1 or 8-3, when the smoke is coming after you. Play this with high texture enabled and then without; you will notice the big slowdown when it is enabled (around a 30 FPS drop).
?? That must be a bug then... Since there's no improvement in activating it (as long as there's no pack... wait for Djipi to get in the mood again ;) ), you should just leave it disabled anyway...
Has this commented code always been in bpstructs.cpp?

Quote:
/*
----------------------------------------------------------------------------------------------------------------
Purpose: Writes to the BP registers
Called: At the end of every: OpcodeDecoding.cpp ExecuteDisplayList > Decode() > LoadBPReg
How It Works: First the pipeline is flushed then update the bpmem with the new value.
Some of the BP cases have to call certain functions while others just update the bpmem.
some bp cases check the changes variable, because they might not have to be updated all the time
NOTE: it seems not all bp cases like checking changes, so calling if (bp.changes == 0 ? false : true)
had to be ditched and the games seem to work fine with out it.
NOTE2: Yet Another Gamecube Documentation calls them Bypass Raster State Registers but possibly completely wrong
NOTE3: This controls the register groups: RAS1/2, SU, TF, TEV, C/Z, PEC
TODO: Turn into function table. The (future) DisplayList (DL) jit can then call the functions directly,
getting rid of dynamic dispatch. Unfortunately, few games use DLs properly - most\
just stuff geometry in them and don't put state changes there
----------------------------------------------------------------------------------------------------------------
*/

// Debugging only, this lets you skip a bp update
//static int times = 0;
//static bool enable = false;

//switch (bp.address)
//{
//case BPMEM_CONSTANTALPHA:
// {
// if (times-- == 0 && enable)
// return;
// else
// break;
// }
//default: break;
//}

If not, and it was only added recently, perhaps it is a sign that the developers are looking into the issue and working on an actual fix. I might be getting a bit hopeful, but I can hope, can't I? :P
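For anyone wondering what the "Turn into function table" TODO in that comment might look like, here is a minimal sketch. It is not Dolphin's code: `BPWrite`, `WriteBP`, `StoreOnly`, `OnExampleReg`, `kExampleReg` and the counters are all hypothetical names made up for illustration, and `FlushPipeline()` is just a stub that counts calls. It only shows the idea described in the comment: flush first, store the new value, then call a per-register handler through a table, with some handlers checking the changed bits before doing any work.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for illustration; not Dolphin's real types or names.
struct BPWrite {
    std::uint8_t  address;   // BP register being written
    std::uint32_t newValue;  // value carried by the write
    std::uint32_t changes;   // bits that differ from the previous value
};

using BPHandler = void (*)(const BPWrite&);

static std::array<std::uint32_t, 256> g_bpmem{};  // shadow copy of the registers
static int g_flushes = 0;         // counts FlushPipeline() calls
static int g_exampleUpdates = 0;  // counts real work done by OnExampleReg

constexpr std::uint8_t kExampleReg = 0x10;  // arbitrary address chosen for the demo

// Stub: a real emulator would submit pending geometry here.
static void FlushPipeline() { ++g_flushes; }

// Default handler: the register only needs storing, which WriteBP already did.
static void StoreOnly(const BPWrite&) {}

// A handler that checks `changes` and skips its work when nothing changed.
static void OnExampleReg(const BPWrite& bp) {
    if (bp.changes != 0)
        ++g_exampleUpdates;
}

// Build the table once: every slot defaults to StoreOnly, special cases override.
static std::array<BPHandler, 256> MakeTable() {
    std::array<BPHandler, 256> table{};
    table.fill(&StoreOnly);
    table[kExampleReg] = &OnExampleReg;
    return table;
}

static const std::array<BPHandler, 256> g_handlers = MakeTable();

// Flush, store the new value, then dispatch through the table instead of a switch.
void WriteBP(std::uint8_t address, std::uint32_t value) {
    FlushPipeline();
    const BPWrite bp{address, value, g_bpmem[address] ^ value};
    g_bpmem[address] = value;
    g_handlers[address](bp);
}
```

In this sketch, calling `WriteBP(kExampleReg, 5)` twice flushes twice but only bumps `g_exampleUpdates` once, because the second write changes no bits. Note that the flush still happens on every write here, which is the cost the hack discussed in this thread is trying to avoid.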

DX11 and OpenGL plugin still need an option for the hack, but no rush since DX9 is way faster with this game.
I don't get it. Some people with an i7 say they've got no problems in Hyrule Field with the latest builds of Xtreme, but I'm still getting around 17 FPS. Using recommended settings, running at 1280x720, windowed.

My specs:
Core i7 920
6 GB RAM 1333 MHz
Windows Vista 64-bit
GeForce 9800GT

Is it a VGA problem?

I'm running Dolphin x64 SVN Revision 5847, from Xtreme2damax.
Do you have Hyperthreading enabled? If so, disable it, as it causes problems with Dolphin. Something definitely isn't right with your configuration or system, since you should be seeing way more than that. Even with my Core2Duo E8500 and GeForce 9800 GT I get 22 FPS - 25 FPS in the slowest Hyrule Field in the game. If you have AA and EFB scaling enabled, disable these options, as a GeForce 9800 GT doesn't do very well with them. You might be able to get away with 4x SSAA and 1x or 2x EFB scaling with minimal impact on performance.
Thank you Xtreme2Damax, with your latest build I only suffered HF @ 20 FPS, in contrast to 11 FPS in some other builds... and that drugs are also gone... GEE THANKS!!!!!

I will play ZTP again on DX11 when I grab an HD5670. I don't know whether this card is good for Dolphin or GTA IV, Crysis, etc... just need some guidance.
Go for the 5770. If I'm not wrong, the 5670 is nearly the same as the 9800GT in performance; it just has DX11. So do not waste money on the 5670.
The problem is a tight budget... I am getting a 512 MB 5670 @ Rs. 5500, and I can even get 1 GB @ Rs. 7k, but the 5770 costs a whopping Rs. 10k... that is my problem... and I don't have a mumbo jumbo display... 512 MB is enough to play at 720p... I guess.
512MB of VRAM is plenty for your resolution; it's shader throughput that we're concerned about, not video memory capacity. And Crysis is a very shader-heavy game. Also, you should really make a thread for this. This really isn't the place to be talking about it, plus you're asking people about it in multiple threads.
Nice work, fircrestsk8. I saw a very dramatic speed improvement. At the Castle Town warp point late in the game I was previously getting only about 16% speed, and the flush-pipeline hack brought that up to about 45 or 50%! That may still sound slow, but it's a speed that I could previously only get with very heavy frame skipping. (Interestingly, adding frame skipping now while using the hack doesn't actually do much at all.) The jit-integer reversion brings me up to about 60% speed.

I have several questions, which I'll try to keep brief.

(1) How was this type of hack discovered in the first place? I noticed that I could do a flavor=prof build, which uses a tool called OProfile for profiling-- and although OProfile is pretty awesome (as I've just discovered), the very detailed data it gives me doesn't really in any way suggest that FlushPipeline() was a major bottleneck. Something tells me that I need to better understand how dolphin's threads work and which things are in what thread.

(2) Why does that one little gpr.StoreFromX64 call in Jit_Integer.cpp make such a big difference? And why is there even an issue of whether it should be there or not?

(3) Anyone else here run Linux?

(4) Is Dolphin's performance under Linux generally similar to that under Windows? If not, how can I help make it so? (I don't want to install Windows, and I'd rather see the support for Open Source World improve if necessary, anyway.)

ETA: (5) Those of us with quad cores could potentially see improvements if certain things were split off into yet more threads. (And multicore is the future, it would seem.) Are there any good candidates for this? Some of the pipeline-y sorts of things, perhaps?