Instead of having the client delete fences, work around the problem
of missing contexts by adding the fences to a list in the GlslManager,
which is then garbage-collected before creating more fences.
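The deferred deletion described here can be sketched roughly as follows. This is a minimal illustration, not the GlslManager code: fence_list, fence_list_defer(), fence_list_collect() and count_delete() are hypothetical names, fences are plain void pointers, and the deleter is a callback so the sketch compiles without OpenGL. In a real build the callback would presumably wrap glDeleteSync(), and the list would need a lock because fences can be queued from other threads.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING_FENCES 64

/* Deleter callback; an assumption standing in for a glDeleteSync() wrapper. */
typedef void (*fence_delete_fn)(void *fence);

/* Hypothetical list of fences waiting to be deleted. */
typedef struct {
    void *pending[MAX_PENDING_FENCES];
    size_t count;
} fence_list;

/* Queue a fence instead of deleting it on a thread that may have no
 * context. Returns 0 on success, -1 if the list is full. */
int fence_list_defer(fence_list *list, void *fence)
{
    assert(list != NULL);
    if (list->count >= MAX_PENDING_FENCES)
        return -1;
    list->pending[list->count++] = fence;
    return 0;
}

/* Delete every queued fence. Meant to be called on the thread that owns
 * the context, just before new fences are created. Returns the number of
 * fences deleted. */
size_t fence_list_collect(fence_list *list, fence_delete_fn del)
{
    assert(list != NULL && del != NULL);
    size_t n = list->count;
    for (size_t i = 0; i < n; i++)
        del(list->pending[i]);
    list->count = 0;
    return n;
}

/* Stand-in deleter for illustration only: it just counts calls. */
int deleted_count = 0;
void count_delete(void *fence)
{
    (void) fence;
    deleted_count++;
}
```

The point of the design is that the client only ever hands fences back; the actual deletion happens where a context is known to be current.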
The glFinish after rendering to a texture serves two purposes:
First, and maybe most importantly, it makes sure that if we send
the texture ID to another thread and try to draw it there, it is
actually valid in that context. (If not, the command to allocate
it could still be stuck in the queue, or the command to draw
the quad to the screen could be queued before the command to
actually render the image to the texture.)
Second, it makes sure we don't overwhelm the GPU with rendering
commands, especially in the readahead thread. GPUs have a long
pipeline, and our command buffers are typically very short
(we render only one or a few quads per frame), which means that
we could queue so much rendering that we couldn't actually get
to display the frames, or do compositing and other normal UI tasks.
(GPUs are not all that good at scheduling.)
However, glFinish() also has an unwanted side effect: Since the
CPU waits for the GPU to finish, it means it cannot do anything
useful in that period; in particular, it cannot start decoding
input video for the next frame, which is very frequently a win.
Thus, we replace glFinish() with fences: One that we store on the
frame and that the client can wait for, and one that we wait for
ourselves before we render the next frame. The first fulfills
purpose #1 above (although a client that doesn't render in a
different thread can just ignore it), while the second fulfills
purpose #2. #2 does reduce the possible pipelining somewhat
(compared to not having any fence at all), but it seems that
the actual performance lost is very small in practice. In any
case, this is markedly faster than glFinish -- on my Intel HD 3000,
it increases GPU utilization from ~40% to over 80% in a typical
transition.
Note that this is an API change; a client that wants to send
the OpenGL texture number on to a different thread for display,
will now need to wait for the fence before it can actually draw
using it.
Make the Movit converter use the correct color primaries.
We need to distinguish between the YUV primaries and the color space;
for instance, my camera outputs Rec. 601/525 YUV but uses Rec. 709 color
primaries. Also fix so that we read the correct full_luma flag, not
just check force_full_luma. (Again, my camera outputs this.)
Movit doesn't support all the exotic color spaces ffmpeg/libav does,
but this should cover most of the common ones.
Jakub Ksiezniak [Thu, 9 Jan 2014 20:44:44 +0000 (21:44 +0100)]
Added a fourth filter that combines both detect and transform passes.
* Increased the default smoothing factor, according to the original
vid.stab default settings.
* Added a deshake data clear when seeking is performed.
* Added a version check in configure script.
Dan Dennedy [Thu, 2 Jan 2014 06:43:51 +0000 (22:43 -0800)]
Add consumer-thread-create and consumer-thread-join events.
If an app listens to these, it can override the implementation of thread
creation and joining. Otherwise, if no listeners, it falls back to
pthread_create() and pthread_join() as usual. At this time, only the
base mlt_consumer uses this, and only for real_time=1 or -1.
When doing glReadPixels(), make sure we read from the right FBO.
Newer versions of Movit clear the FBO attachment after rendering to an FBO
(so that it's harder to accidentally attach to the same FBO from multiple
threads), so we need to explicitly choose one to read from.
Dan Dennedy [Tue, 31 Dec 2013 04:05:20 +0000 (20:05 -0800)]
Refactor movit.convert, movit.mix, and movit.overlay.
To use new methods on GlslManager: render_frame_texture() and
render_frame_rgba(). The latter routine was changed to use GL_BGRA in
glReadPixels() to improve performance on more OpenGL implementations (per
Steinar Gunderson's recommendation).
Dan Dennedy [Wed, 6 Nov 2013 03:25:50 +0000 (19:25 -0800)]
Fix audio distortion in float -> int32 conversion.
This was noticeable when using the sox filter and became prominent when
libavcodec introduced per-codec audio sample formats.
Also, add CLAMP to make code more readable.
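As an illustration of a clamped float -> int32 conversion, here is a minimal sketch. It is not the code from this commit: the CLAMP definition and the float_to_int32() name are assumptions made for the example. One general pitfall it avoids is that 2147483647 is not representable as a float (it rounds up to 2147483648), so clamping in single precision and then casting can overflow; the sketch scales and clamps in double instead.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition of a CLAMP helper; the real macro may differ. */
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : (x) > (hi) ? (hi) : (x))

/* Convert one sample in the nominal range [-1.0, 1.0] to signed 32-bit.
 * Out-of-range input saturates instead of wrapping around. */
int32_t float_to_int32(float sample)
{
    /* Scale by 2^31 in double precision, where both limits are exact. */
    double scaled = (double) sample * 2147483648.0;
    return (int32_t) CLAMP(scaled, -2147483648.0, 2147483647.0);
}
```

With this sketch, a full-scale positive sample saturates to INT32_MAX rather than wrapping to a large negative value, which is the kind of error that is heard as harsh distortion.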
Dan Dennedy [Mon, 28 Oct 2013 06:10:49 +0000 (23:10 -0700)]
Fix videostab2 interpolation.
This filter uses RGB mode, for which bicubic is broken. vid.stab still
to this day uses bilinear with packed pixel formats. To fix bilinear,
the extra calls to the floor function had to be removed.
Dan Dennedy [Thu, 24 Oct 2013 01:27:31 +0000 (18:27 -0700)]
Fix crash removing filter attached to a service.
There can still be frame objects that have a filter's get_image function
in their image processing stack. Need to add a reference to the filter on
the frame objects.
Dan Dennedy [Wed, 16 Oct 2013 05:24:02 +0000 (22:24 -0700)]
Fix serializing an xml producer by itself.
Applications should use the xml producer's _original_type and
_original_resource properties and coerce it to a playlist or tractor to
serialize the entire graph of nodes.
Dan Dennedy [Sun, 22 Sep 2013 01:34:01 +0000 (18:34 -0700)]
Fix some race conditions in mlt_consumer.
OpenGL apps need to receive consumer-thread-stopped *after* all of the
frames are closed. Otherwise, they may clean up the GL context before
frames holding context resources are closed.
Some consumer threads call mlt_consumer_purge().
Dan Dennedy [Sat, 24 Aug 2013 18:33:04 +0000 (11:33 -0700)]
Add "close glsl" event to glsl.manager service.
Qt 5 apps (and possibly others) must use this because the OpenGL context
for rendering needs to be created and destroyed on the thread on which
it is actually used. This should be fired on the glsl.manager filter
instance inside of a consumer-thread-stopped mlt_event listener.