That's why I said built-in profilers suck; GPU drivers delay any operation as long as they can, and this includes RAM->VRAM transfers.
What you're probably seeing is that a vertex buffer was still in use when we tried writing new data into it. The GPU driver likely delayed the actual RAM->VRAM upload to the next DrawPrimitive() call, which is why you see so much CPU time spent there.
...Or it's just the typical case where the CPU says "Yo GPU, here's some data to draw" and the GPU answers "Yo dawg, I'm still busy, so wait a minute with that data".
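To make the first theory concrete, here's a rough toy model of a driver that defers uploads. It is not real driver code or any Direct3D API: `ToyDriver`, `UpdateBuffer`, `simulateFrame`, and the "cost" counters are all made up for illustration, and the byte counts are arbitrary. The only point is that the deferred work gets billed to whichever call finally forces it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: a hypothetical "lazy" driver, not a real graphics API.
struct ToyDriver {
    std::vector<std::size_t> pendingUploads; // uploads queued but not yet performed
    std::size_t costInUpdate = 0;            // work units charged to UpdateBuffer
    std::size_t costInDraw = 0;              // work units charged to DrawPrimitive

    // The app hands over new vertex data; the toy driver only records the request.
    void UpdateBuffer(std::size_t bytes) {
        pendingUploads.push_back(bytes);
    }

    // The draw needs the data, so every deferred upload is performed here.
    void DrawPrimitive() {
        for (std::size_t bytes : pendingUploads) {
            costInDraw += bytes; // upload cost lands on the draw call
        }
        pendingUploads.clear();
        assert(pendingUploads.empty());
        costInDraw += 1; // the draw itself
    }
};

// Hypothetical frame: two buffer updates followed by one draw.
ToyDriver simulateFrame() {
    ToyDriver driver;
    driver.UpdateBuffer(4096);
    driver.UpdateBuffer(8192);
    driver.DrawPrimitive();
    return driver;
}
```

A profiler that only measures time per API call would see `UpdateBuffer` as free and `DrawPrimitive` as expensive in this model, even though the expensive part is the upload.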
