Creating a PBO to hold the texture data just before upload (as we
did before this patch) is pointless; if that helped at all, the driver
could just do it itself. Instead, we expose the PBOs to the application
(in such a way that applications that don't care can continue to use
the simple interface). This means that a client that needs to do
e.g. a fade can optimize its texture upload with a process like this:

1. Decode frame from input A.
2. Upload frame from input A to GPU (by putting it into a PBO).
   Texture upload starts in the background.
3. Decode frame from input B.
4. Upload frame from input B to GPU. (This time, there will be
   no parallelism, though.)
5. Render.

With correct use of ping-ponging PBOs, it is also possible to overlap
steps 4 and 5 with operations from the _next_ frame in the fade.
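
As a rough illustration of the ping-ponging idea (a sketch only, not
code from this patch): assume each input owns two PBOs and alternates
between them per frame, so that decoding frame N+1 can write into one
PBO while the GPU may still be reading frame N from the other. The
names pbo_slot() and plan_fade() are made up for this example, and the
GL calls are only mentioned in comments, since they need a live
context to run.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: which of an input's two PBOs the CPU should fill
// for a given frame. Alternating means the PBO used for the previous
// frame is left alone while its upload may still be in flight.
std::size_t pbo_slot(std::size_t frame)
{
    return frame % 2;
}

// Hypothetical planner: lists the operations for a two-input fade in the
// order the client would issue them. A real client would replace each
// string with the corresponding decode or GL call.
std::vector<std::string> plan_fade(std::size_t num_frames)
{
    std::vector<std::string> ops;
    for (std::size_t frame = 0; frame < num_frames; ++frame) {
        const std::string slot = std::to_string(pbo_slot(frame));
        // Step 1: decode straight into mapped PBO memory
        // (e.g. obtained via glMapBufferRange on GL_PIXEL_UNPACK_BUFFER).
        ops.push_back("decode A into PBO A" + slot);
        // Step 2: glTexSubImage2D with the PBO bound returns quickly;
        // the transfer can proceed while step 3 runs on the CPU.
        ops.push_back("upload A from PBO A" + slot);
        // Step 3.
        ops.push_back("decode B into PBO B" + slot);
        // Step 4.
        ops.push_back("upload B from PBO B" + slot);
        // Step 5.
        ops.push_back("render frame " + std::to_string(frame));
    }
    return ops;
}
```

Because frame N+1 uses the other slot, its step 1 does not have to
wait for frame N's step 4/5 to finish with the PBOs from frame N.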
More information can be found in this presentation: