This allows us to get rid of effects that don't actually do anything
(like all the normalizers in the common case); in Movit, they tend
to burn a lot of memory bandwidth. We solve this by a new OptionalEffect
template, that can rewrite itself out of the graph if it sees it is
a no-op. We need to recreate the chain from scratch if this status
should change (e.g. the input resolution changed from one frame to the
next, and we thus suddenly need resizing after all), so we keep a
"fingerprint" string that contains all the unique IDs of the services
in use, as well as their disabled status, and compare against it every frame.
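As a sketch of the fingerprint idea (the struct and function names here are hypothetical, not MLT's actual code): concatenate each service's unique ID and disabled flag, and rebuild the chain whenever the string differs from the one stored for the previous frame.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-service data the fingerprint covers.
struct ServiceInfo {
	std::string unique_id;
	bool disabled;
};

// Build a fingerprint of the whole chain; if it changes between frames
// (e.g. a resize suddenly becomes needed), the chain must be rebuilt.
std::string chain_fingerprint(const std::vector<ServiceInfo> &services)
{
	std::string fp;
	for (const ServiceInfo &s : services) {
		fp += s.unique_id;
		fp += s.disabled ? "[off]" : "[on]";
	}
	return fp;
}
```

A caller would keep the previous frame's string and compare: equal means the existing EffectChain can be reused, different means rebuild from scratch.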
Building the chain in one piece also allows transitions to be more
efficient; they are now built as part of one big Movit chain, instead of
bouncing to an 8-bit sRGB buffer and back.
* Change the mlt_glsl type.
Now, the mlt_glsl image type has a defined value, which is the
mlt_service pointer. Each filter is responsible for storing this input
service. This, together with the mlt_frame, enables us to actually
build the Movit chain based on the MLT relations, instead of just relying on
the order in which they are called and assuming everything has a single input.
As a special case, the value (mlt_service) -1 (which should never be a valid
pointer) means that we read the information from an input rather than an
effect. In this case, we take a copy of the pixel data we get in (since it
will soon be garbage collected), store it in an MltInput and then store that
MltInput for later use. This could probably be further simplified in the
future to get completely rid of MltInput and just use the regular
FlatInput/YCbCrInput instead.
This also requires a change so that the chain is built and finalized at
the _end_ of the conversion steps (where it's logically needed), instead of at
the beginning as before. The beginning (conversion from * -> mlt_glsl) now
only stores the input as described above.
* Change Effect and EffectChain storage.
This changes the storage of Movit stuff as follows:
- The EffectChain (along with some associated information to be able
to more easily locate the services and Effect pointers; together,
called a GlslChain) is now stored on the output service, not on the input
producer. This allows us to have multiple EffectChains floating around.
- The Effect pointers no longer live permanently on the MLT graph, since
each MLT service can have more than one Effect. Instead, they live
temporarily on the frame (because the frame is not shared between
threads, giving us a poor man's version of thread-local storage),
until they reach the point where we decide if we need to rebuild the
EffectChain or not. At this point, they are either made part of the
chain (and owned by it), or disposed as unneeded.
- The MltInput also lives on the frame. (If we have multiple inputs,
we also have multiple frames.) As mentioned above, its use is signaled by
an mlt_service of -1.
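The sentinel convention above can be sketched like this (the typedef and names are stand-ins; MLT's real mlt_service is an opaque struct pointer):

```cpp
#include <cassert>
#include <cstddef>

typedef void *mlt_service;  // stand-in for MLT's opaque service handle

// (mlt_service) -1 should never be a valid pointer, so it is safe to use
// as a marker meaning "pixel data comes from an input, not an effect".
const mlt_service MLT_GLSL_INPUT_MARKER = (mlt_service) -1;

// Decide whether the incoming data should be copied into an MltInput
// (the source buffer is about to be recycled) or looked up as an Effect
// belonging to an upstream service.
bool needs_input_copy(mlt_service producer)
{
	return producer == MLT_GLSL_INPUT_MARKER;
}
```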
* Change how Movit parameter setting works.
Services no longer set parameters directly on the Movit filters, since
they cannot know before graph construction time whether the correct
destination is the newly created Effect, or a similar one in the EffectChain.
Instead, they set special properties (movit.parms.<type>.<name>[<index>]),
and then the convert filter uses these to set Movit parameters on the right
Effects.
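The movit.parms.&lt;type&gt;.&lt;name&gt;[&lt;index&gt;] key format is from the text above; the parser below is a sketch of how the convert filter might split such a key, not MLT's actual code:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Parsed form of a key like "movit.parms.float.radius" or
// "movit.parms.vec2.center[1]". index is -1 when no [<index>] suffix exists.
struct MovitParm {
	std::string type;
	std::string name;
	int index = -1;
};

std::optional<MovitParm> parse_movit_parm(const std::string &key)
{
	const std::string prefix = "movit.parms.";
	if (key.compare(0, prefix.size(), prefix) != 0)
		return std::nullopt;
	std::string rest = key.substr(prefix.size());
	size_t dot = rest.find('.');
	if (dot == std::string::npos)
		return std::nullopt;
	MovitParm p;
	p.type = rest.substr(0, dot);
	std::string name = rest.substr(dot + 1);
	size_t br = name.find('[');
	if (br != std::string::npos && !name.empty() && name.back() == ']') {
		p.index = std::stoi(name.substr(br + 1, name.size() - br - 2));
		name = name.substr(0, br);
	}
	p.name = name;
	return p;
}
```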
Instead of having the client delete fences, we work around the problem
of missing contexts by adding them to a list in the GlslManager, which is
then garbage-collected before creating more fences.
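The deferred-deletion list can be sketched as follows (class and method names are hypothetical; the deleter callback stands in for glDeleteSync(), which must run on a thread that owns the GL context):

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <vector>

// Fences queued by threads without a current GL context; the render thread
// garbage-collects them before creating new fences.
class FenceList {
	std::mutex mu;
	std::vector<void *> stale;

public:
	// Called from any thread; just records the fence for later deletion.
	void defer_delete(void *fence)
	{
		std::lock_guard<std::mutex> l(mu);
		stale.push_back(fence);
	}

	// Called on the thread owning the GL context, before new fences are made.
	void collect(const std::function<void(void *)> &delete_fence)
	{
		std::lock_guard<std::mutex> l(mu);
		for (void *f : stale)
			delete_fence(f);
		stale.clear();
	}
};
```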
The glFinish after rendering to a texture serves two purposes:
First, and maybe most importantly, it makes sure that if we send
the texture ID to another thread and try to draw it there, it is
actually valid in that context. (If not, the command to allocate
it could still be stuck in the queue, or the command to draw
the quad to the screen could be queued before the command to
actually render the image to the texture.)
Second, it makes sure we don't overwhelm the GPU with rendering
commands, especially in the readahead thread. GPUs have a long
pipeline, and our command buffers are typically very short
(we render only one or a few quads per frame), which means that
we could queue so much rendering that we couldn't actually get
to display the frames, or do compositing and other normal UI tasks.
(GPUs are not all that good at scheduling.)
However, glFinish() also has an unwanted side effect: Since the
CPU waits for the GPU to finish, it means it cannot do anything
useful in that period; in particular, it cannot start decoding
input video for the next frame, which would very frequently be a win.
Thus, we replace glFinish() with fences: One that we store on the
frame and that the client can wait for, and one that we wait for
ourselves before we render the next frame. The first fulfills
purpose #1 above (although a client that doesn't render in a
different thread can just ignore it), while the second fulfills
purpose #2. #2 does reduce the possible pipelining somewhat
(compared to not having any fence at all), but it seems that
the actual performance lost is very small in practice. In any
case, this is markedly faster than glFinish -- on my Intel HD 3000,
it increases GPU utilization from ~40% to over 80% in a typical
transition.
Note that this is an API change; a client that wants to send
the OpenGL texture number on to a different thread for display,
will now need to wait for the fence before it can actually draw
using it.
Make the Movit converter use the correct color primaries.
We need to distinguish between the YUV primaries and the color space;
for instance, my camera outputs Rec. 601/525 YUV but uses Rec. 709 color
primaries. Also fix it so that we read the correct full_luma flag, not
just check force_full_luma. (Again, my camera outputs this.)
Movit doesn't support all the exotic color spaces ffmpeg/libav does,
but this should cover most of the common ones.
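The separation of matrix and primaries can be sketched like this (the enum and function names are stand-ins, not Movit's actual identifiers; the integers 601/709 mimic the usual colorspace property values):

```cpp
#include <cassert>

// The YCbCr matrix and the RGB primaries are independent choices.
enum LumaCoefficients { COEFFS_REC_601, COEFFS_REC_709 };
enum Primaries { PRIMARIES_REC_601_525, PRIMARIES_REC_709 };

struct ColorConfig {
	LumaCoefficients luma;
	Primaries primaries;
	bool full_luma;
};

// Pick both settings from separate frame properties; full_luma comes from
// the real frame flag, not just from force_full_luma.
ColorConfig pick_color(int colorspace, int color_primaries, bool full_range)
{
	ColorConfig c;
	c.luma = (colorspace == 709) ? COEFFS_REC_709 : COEFFS_REC_601;
	c.primaries = (color_primaries == 709) ? PRIMARIES_REC_709
	                                       : PRIMARIES_REC_601_525;
	c.full_luma = full_range;
	return c;
}
```

The camera case from the text maps to pick_color(601, 709, false): Rec. 601/525 YUV matrix, but Rec. 709 primaries.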
Jakub Ksiezniak [Thu, 9 Jan 2014 20:44:44 +0000 (21:44 +0100)]
Added a fourth filter that combines both the detect and transform passes.
* Increased the default smoothing factor to match the original
vid.stab default settings.
* Added a deshake data clear when seeking is performed.
* Added a version check in configure script.
Dan Dennedy [Thu, 2 Jan 2014 06:43:51 +0000 (22:43 -0800)]
Add consumer-thread-create and consumer-thread-join events.
If an app listens to these, it can override the implementation of thread
creation and joining. Otherwise, if there are no listeners, it falls back to
pthread_create() and pthread_join() as usual. At this time, only the
base mlt_consumer uses this, for real_time=1 or -1.
When doing glReadPixels(), make sure we read from the right FBO.
Newer versions of Movit clear the FBO attachment after rendering to an FBO
(so that it's harder to accidentally attach to the same FBO from multiple
threads), so we need to explicitly choose one to read from.
Dan Dennedy [Tue, 31 Dec 2013 04:05:20 +0000 (20:05 -0800)]
Refactor movit.convert, movit.mix, and movit.overlay.
To use new methods on GlslManager: render_frame_texture() and
render_frame_rgba(). The latter routine was changed to use GL_BGRA in
glReadPixels() to improve performance on more OpenGL implementations (per
Steinar Gunderson's recommendation).
Dan Dennedy [Wed, 6 Nov 2013 03:25:50 +0000 (19:25 -0800)]
Fix audio distortion in float -> int32 conversion.
This was noticeable when using the sox filter and became prominent when
libavcodec introduced per-codec audio sample formats.
Also, add CLAMP to make code more readable.
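A sketch of the clamped conversion (the function name and exact scale factor are illustrative; MLT's code uses its CLAMP macro to the same effect):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Convert a float sample in [-1.0, 1.0) to a signed 32-bit sample.
// Scaling in double and clamping before the cast avoids the overflow
// that distorted loud samples in the unclamped version.
static int32_t float_to_s32(float f)
{
	double v = (double) f * 2147483648.0;
	v = std::min(std::max(v, -2147483648.0), 2147483647.0);  // CLAMP
	return (int32_t) v;
}
```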
Dan Dennedy [Mon, 28 Oct 2013 06:10:49 +0000 (23:10 -0700)]
Fix videostab2 interpolation.
This filter uses RGB mode, for which bicubic is broken. vid.stab still
to this day uses bilinear with packed pixel formats. In order to fix
bilinear, we needed to remove extra calls to the floor function.
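A sketch of correct bilinear sampling for a single-channel image (not the filter's actual code): floor is taken once per axis, and the fractional weights are derived from that same floor, so no extra floor calls are needed.

```cpp
#include <cassert>
#include <cmath>

// Bilinear sample of a w*h single-channel image at fractional (x, y),
// clamping at the edges. One floorf per axis; fx/fy reuse its result.
static float bilinear(const float *img, int w, int h, float x, float y)
{
	int x0 = (int) floorf(x);
	int y0 = (int) floorf(y);
	float fx = x - (float) x0;
	float fy = y - (float) y0;
	if (x0 < 0) { x0 = 0; fx = 0.0f; }
	if (y0 < 0) { y0 = 0; fy = 0.0f; }
	int x1 = (x0 + 1 < w) ? x0 + 1 : w - 1;
	int y1 = (y0 + 1 < h) ? y0 + 1 : h - 1;
	float top = img[y0 * w + x0] * (1.0f - fx) + img[y0 * w + x1] * fx;
	float bot = img[y1 * w + x0] * (1.0f - fx) + img[y1 * w + x1] * fx;
	return top * (1.0f - fy) + bot * fy;
}
```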
Dan Dennedy [Thu, 24 Oct 2013 01:27:31 +0000 (18:27 -0700)]
Fix crash removing filter attached to a service.
There can still be frame objects that have a filter's get_image function
in their image processing stacks. We need to add a reference to the filter
on those frame objects.
Dan Dennedy [Wed, 16 Oct 2013 05:24:02 +0000 (22:24 -0700)]
Fix serializing an xml producer by itself.
Applications should use the xml producer's _original_type and
_original_resource properties and coerce it to a playlist or tractor to
serialize the entire graph of nodes.
Dan Dennedy [Sun, 22 Sep 2013 01:34:01 +0000 (18:34 -0700)]
Fix some race conditions in mlt_consumer.
OpenGL apps need to receive consumer-thread-stopped *after* all of the
frames are closed. Otherwise, it may clean up the GL context before
frames holding context resources are closed.
Some consumer threads call mlt_consumer_purge().
Dan Dennedy [Sat, 24 Aug 2013 18:33:04 +0000 (11:33 -0700)]
Add "close glsl" event to glsl.manager service.
Qt 5 apps (and possibly others) must use this because the OpenGL context
for rendering needs to be created and destroyed on the thread on which
it is actually used. This should be fired on the glsl.manager filter
instance inside of a consumer-thread-stopped mlt_event listener.