Streeter: GPU-based decoding would speed things up for large textures. OpenCL would be fine if we used interop, so that OpenGL / D3D can share a texture with OpenCL _and_ sync with each other (this isn't supported by the usual vendors). With that interop, the GPU-based texture decoder wouldn't flush all GPU pipes (up to 10ms delay) for every small decode (less than 1ms).
BUT, there is no need to do this in OpenCL, as both OpenGL and D3D are shader-based, which allows us to do exactly the same thing without syncing or stalling issues. The context switch delay is also much smaller when we don't switch APIs.
So there is no need to fix this OpenCL code, but there is a need to implement such a decoder in our video backends. imo this could be merged into our efb2ram encoding shader, which already does _exactly_ the inverse job on the gpu.
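To make the "inverse job" concrete, here is a rough CPU-side sketch of the per-texel gather such a decoding shader would evaluate. It assumes an 8-bit intensity texture stored in 8x4 tiles of 32 bytes each, laid out in row-major tile order; `TiledOffsetI8` and `DecodeI8` are hypothetical names for illustration, not functions from our code. In a real backend the body of the loop would live in a fragment shader, and the encoder would run the same mapping in the other direction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed tile layout: 8x4 texels, one byte per texel, 32 bytes per tile.
constexpr std::size_t kTileW = 8;
constexpr std::size_t kTileH = 4;

// Byte offset in the tiled source for the linear texel (x, y).
// This is the address math a decoding shader would do once per output pixel.
std::size_t TiledOffsetI8(std::size_t x, std::size_t y, std::size_t width)
{
  const std::size_t tiles_per_row = width / kTileW;
  const std::size_t tile_index = (y / kTileH) * tiles_per_row + (x / kTileW);
  const std::size_t in_tile = (y % kTileH) * kTileW + (x % kTileW);
  return tile_index * (kTileW * kTileH) + in_tile;
}

// Untile a whole texture into a linear, row-major buffer.
std::vector<std::uint8_t> DecodeI8(const std::vector<std::uint8_t>& src,
                                   std::size_t width, std::size_t height)
{
  // The sketch only handles sizes that are whole multiples of the tile size.
  assert(width % kTileW == 0 && height % kTileH == 0);
  assert(src.size() >= width * height);

  std::vector<std::uint8_t> dst(width * height);
  for (std::size_t y = 0; y < height; ++y)
    for (std::size_t x = 0; x < width; ++x)
      dst[y * width + x] = src[TiledOffsetI8(x, y, width)];
  return dst;
}
```

Since every output texel depends on exactly one source address, the work is independent per pixel, which is why it maps to a shader pass without any cross-API sync.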
