Surprisingly, this is most useful for VA-API decodes; they can
have long latency at 1080p, and Futatabi's dropping scheme sometimes
caused massive unfairness. Our system doesn't pipeline all that
nicely, so just having multiple threads was the simplest solution.
The risk is that we now access VA-API from multiple threads, which
tends to expose bugs, but we'll see.
Of course, CPU decoding benefits as well.