This works around the issue that Movit might want to change from GLEW
to something else in the (very near) future; it's maybe not very pretty,
but it works pretty well in practice.
Dan Dennedy [Tue, 11 Feb 2014 01:29:50 +0000 (17:29 -0800)]
On Windows, ensure consumer-thread-create fires on caller thread.
This is needed to get GPU processing working on Shotcut for Windows and
probably other Qt 5 apps. However, that causes some bad behavior with
Movit on Linux. So, this change is only on Windows for now.
Dan Dennedy [Sat, 1 Feb 2014 05:37:10 +0000 (21:37 -0800)]
If LC_NUMERIC unsupported do not inadvertently change locale.
On Windows, we should be able to use _create_locale() and _free_locale(),
but using them results in unresolved symbols when linking on MinGW 4.8.
Calling setlocale() with the fallback value "" changes the locale to the
system-defined one. With this change, on OSes where changing LC_NUMERIC is
not supported, we call setlocale() with NULL, which makes the call passive.
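The passive behavior can be sketched as follows (a minimal illustration; the helper name is ours, not MLT's):

```cpp
#include <clocale>

// setlocale() with a NULL second argument only queries the current
// locale and changes nothing ("passive"), whereas "" would switch to
// the system-defined locale as a side effect.
const char *query_numeric_locale() {
    return setlocale(LC_NUMERIC, nullptr);  // query only, no change
}
```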
Given that new/delete on such small objects are cheap and this happens
rarely, it is probably not worth the extra complexity. (In the process,
fix a minor bug related to out-of-memory handling; not that new will
actually ever return NULL on any compiler newer than MSVC6.)
FBOs are cheap to construct and delete (they carry almost no state),
so it is less complex just to create them on the fly. Doing so also leaks
less, since we use new contexts all the time.
Having MltInput be an Input that forwards down to the real implementation
has been a source of multiple headaches, and most recently, when finalize()
disappeared, the source of a broken build. We still need the unified
set_pixel_pointer() etc., but the class is now simply a holder of the Input*,
not a forwarder as seen from the EffectChain.
Brian Matherly [Wed, 29 Jan 2014 18:44:09 +0000 (12:44 -0600)]
Save vidstab results to file.
Rather than save vidstab results (which can get quite large) in the properties, save them in a separate file.
Also redirect vid.stab log messages through the MLT logging system (sort of).
Dan Dennedy [Wed, 29 Jan 2014 06:40:53 +0000 (22:40 -0800)]
Add xml_retain property support to xml module.
This is used to serialize and deserialize extra services that are not
part of the last service's graph. This is useful, for example, to
save and load a media bin as a playlist in addition to the main
multitrack graph. Or, it can be used for compound documents.
It is technically allowed to use GL_RGBA as internal format,
but then it is undefined whether you get 4-bit, 8-bit, 10-bit
or something else. Set it explicitly. (Since we pass in a NULL
pointer, we can specify whatever external format we want; just
hardcode it to GL_RGBA.)
This attempt at optimization is actually detrimental on modern CPUs.
Removing it improves playback speed by ~0.3%. mlt_properties_get_data() is
now down to ~0.6%.
Call invalidate_pixel_data() after frame rendering.
This helps the input return its resources to the ResourcePool,
which means we won't be allocating ever more textures as we get
more clips on the timeline.
Dan Dennedy [Tue, 21 Jan 2014 06:34:50 +0000 (22:34 -0800)]
Fix a few problems with YCbCr colorspace conversion.
In the avformat producer on libav (and FFmpeg < v2.1), conversion from RGB to
YCbCr would not use the destination colorspace because
sws_getColorspaceDetails() fails. Switch to calling only
sws_setColorspaceDetails().
In full luma yuvj420p->mlt_image_yuv420p conversion, the luma range was
always scaled down to MPEG range. The swscale implementation does not
let one override the range, as the conversion routines are initialized at
the time a swscale context is allocated and initialized; any changes made
via sws_setColorspaceDetails() are therefore moot.
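To make the ranges concrete, this is the scaling in question: full-range (yuvj420p) luma in 0..255 mapped down to MPEG/video range 16..235 (a worked example only, not MLT's code):

```cpp
#include <cstdint>

// Map full-range luma (0..255) to MPEG range (16..235):
// y_out = 16 + y_in * 219/255, with rounding.
uint8_t full_to_mpeg_luma(uint8_t y) {
    return static_cast<uint8_t>(16 + (y * 219 + 127) / 255);
}
```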
In RGB->YCbCr conversion, the existing (source) colorspace was used
instead of the profile colorspace. Also, we need to set the new
colorspace as a property of the frame.
Brian Matherly [Mon, 20 Jan 2014 02:51:30 +0000 (20:51 -0600)]
Updates to vid.stab module.
* Correct some metadata
* Remove "reset" property by making deshake properties mutable.
* Implement "reload" for vidstab for reloading results.
* Misc. changes for MLT consistency.
This allows effects to signal that some sort of change means the chain
needs to be regenerated. In particular, this unbreaks changing the
matrix_size parameter of DeconvolutionSharpenEffect; if you change it,
the entire chain will now be regenerated, instead of getting an assertion
failure.
Stop special-casing the disable parameter for setting.
There are more parameters than just 'disable' that should be set before
chain finalization; in particular, DeconvolutionSharpenEffect compiles
the matrix size into the shader. Instead, just set all the parameters
once right after the chain has been built, which includes the disable
parameter.
Brian Matherly [Tue, 14 Jan 2014 13:44:34 +0000 (07:44 -0600)]
Updates to vid.stab module.
* Clean up serialization/deserialization.
* Results are not published until the analysis step is complete.
* Results are stored in the "results" property.
* Misc. changes for MLT conventions and consistency.
This allows us to get rid of effects that don't actually do anything
(like all the normalizers in the common case); in Movit, they tend
to burn a lot of memory bandwidth. We solve this with a new OptionalEffect
template that can rewrite itself out of the graph if it sees it is
a no-op. We need to recreate the chain from scratch if this status
should change (e.g. the input resolution changed from one frame to the
next, and we thus suddenly need resizing after all), so we keep a
"fingerprint" string that contains all the unique IDs of the services
in use, as well as their disabled status, and compare against it each frame.
Building the chain in one piece also allows transitions to be more
efficient; they are now built as part of one big Movit chain, instead of
bouncing to an 8-bit sRGB buffer and back.
* Change the mlt_glsl type.
Now, the mlt_glsl image type has a defined value, which is the
mlt_service pointer. Each filter is responsible for storing this input
service. This, together with the mlt_frame, enables us to actually
build the Movit chain based on the MLT relations, instead of just relying on
the order in which they are called and assuming everything has a single input.
As a special case, the value (mlt_service) -1 (which should never be a valid
pointer) means that we read the information from an input rather than an
effect. In this case, we take a copy of the pixel data we get in (since it
will soon be garbage collected), store it in an MltInput and then store that
MltInput for later use. This could probably be simplified further in the
future to get rid of MltInput completely and just use the regular
FlatInput/YCbCrInput instead.
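The sentinel described above can be illustrated as follows (mlt_service is really an MLT struct pointer; an opaque handle stands in for it here, and the helper name is ours):

```cpp
// Stand-in for MLT's opaque service handle.
typedef void *mlt_service;

// (mlt_service)-1 should never be a valid pointer, so it can safely mark
// "the pixel data comes from an input rather than an effect".
bool is_input_marker(mlt_service s) {
    return s == reinterpret_cast<mlt_service>(-1);
}
```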
This also requires us to change so that the chain is built and finalized at
the _end_ of the conversion steps (where it's logically needed), instead of at
the beginning as before. The beginning (conversion from * -> mlt_glsl) now
only stores the input, as described above.
* Change Effect and EffectChain storage.
This changes the storage of Movit stuff as follows:
- The EffectChain (along with some associated information to be able
to more easily locate the services and Effect pointers; together,
called a GlslChain) is now stored on the output service, not on the input
producer. This allows us to have multiple EffectChains floating around.
- The Effect pointers no longer live permanently on the MLT graph, since
each MLT service can have more than one Effect. Instead, they live
temporarily on the frame (because the frame is not shared between
threads, giving us a poor man's version of thread-local storage),
until they reach the point where we decide if we need to rebuild the
EffectChain or not. At this point, they are either made part of the
chain (and owned by it), or disposed as unneeded.
- The MltInput also lives on the frame. (If we have multiple inputs,
we also have multiple frames.) As mentioned above, its use is signaled by
an mlt_service of -1.
* Change how Movit parameter setting works.
Services no longer set parameters directly on the Movit filters, since
they cannot know before graph construction time whether the correct
destination is the newly created Effect, or a similar one in the EffectChain.
Instead, they set special properties (movit.parms.<type>.<name>[<index>]),
and then the convert filter uses these to set Movit parameters on the right
Effects.
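The naming convention can be sketched with a small decoder (illustrative only, not MLT's actual parser; the optional [<index>] suffix is left out for brevity):

```cpp
#include <optional>
#include <string>

// Decoded form of a "movit.parms.<type>.<name>" property key.
struct MovitParm {
    std::string type;  // e.g. "float" or "int"
    std::string name;  // the Movit parameter name
};

// Split a property key of the form "movit.parms.<type>.<name>";
// returns nothing for keys that don't follow the convention.
std::optional<MovitParm> parse_movit_parm(const std::string &key) {
    const std::string prefix = "movit.parms.";
    if (key.compare(0, prefix.size(), prefix) != 0)
        return std::nullopt;
    const std::string rest = key.substr(prefix.size());
    const size_t dot = rest.find('.');
    if (dot == std::string::npos)
        return std::nullopt;
    return MovitParm{rest.substr(0, dot), rest.substr(dot + 1)};
}
```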
Instead of having the client delete fences, work around the problem of
missing contexts by adding them to a list in the GlslManager, which is then
garbage-collected before creating more fences.
The glFinish after rendering to a texture serves two purposes:
First, and maybe most importantly, it makes sure that if we send
the texture ID to another thread and try to draw it there, it is
actually valid in that context. (If not, the command to allocate
it could still be stuck in the queue, or the command to draw
the quad to the screen could be queued before the command to
actually render the image to the texture.)
Second, it makes sure we don't overwhelm the GPU with rendering
commands, especially in the readahead thread. GPUs have a long
pipeline, and our command buffers are typically very short
(we render only one or a few quads per frame), which means that
we could queue so much rendering that we couldn't actually get
to display the frames, or do compositing and other normal UI tasks.
(GPUs are not all that good at scheduling.)
However, glFinish() also has an unwanted side effect: Since the
CPU waits for the GPU to finish, it means it cannot do anything
useful in that period; in particular, it cannot start decoding
input video for the next frame, which would very frequently be a win.
Thus, we replace glFinish() with fences: One that we store on the
frame and that the client can wait for, and one that we wait for
ourselves before we render the next frame. The first fulfills
purpose #1 above (although a client that doesn't render in a
different thread can just ignore it), while the second fulfills
purpose #2. #2 does reduce the possible pipelining somewhat
(compared to not having any fence at all), but it seems that
the actual performance lost is very small in practice. In any
case, this is markedly faster than glFinish -- on my Intel HD 3000,
it increases GPU utilization from ~40% to over 80% in a typical
transition.
Note that this is an API change; a client that wants to send
the OpenGL texture number on to a different thread for display
will now need to wait for the fence before it can actually draw
using it.
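The client-side half of the protocol can be mimicked with standard threading primitives (a toy model only; the real code uses OpenGL sync objects, not std::future, and all names here are ours):

```cpp
#include <future>
#include <thread>

// A frame handed to the client: the texture plus a fence that is
// signaled once rendering has actually completed.
struct RenderedFrame {
    int texture_id;                  // stand-in for the GL texture name
    std::shared_future<void> fence;  // client waits on this before drawing
};

// Pretend-renderer: kicks off asynchronous "GPU work" and signals the
// fence when it is done, instead of blocking the caller like glFinish().
RenderedFrame render_frame(int texture_id) {
    std::promise<void> done;
    RenderedFrame frame{texture_id, done.get_future().share()};
    std::thread([p = std::move(done)]() mutable { p.set_value(); }).detach();
    return frame;
}
```

The client must call `frame.fence.wait()` before drawing with the texture in another thread, mirroring the API change described above.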
Make the Movit converter use the correct color primaries.
We need to distinguish between the YUV primaries and the color space;
for instance, my camera outputs Rec. 601/525 YUV but uses Rec. 709 color
primaries. Also fix so that we read the correct full_luma flag, not
just check force_full_luma. (Again, my camera outputs this.)
Movit doesn't support all the exotic color spaces ffmpeg/libav does,
but this should cover most of the common ones.