Stefano Sabatini [Thu, 14 Feb 2013 23:26:22 +0000 (00:26 +0100)]
lavfi/overlay: implement shortest option
Force termination when the overlay stream ends. This simplifies scripting logic,
for example when an infinite source is used to generate a background for
a composite video.
Stefano Sabatini [Sun, 10 Feb 2013 20:32:37 +0000 (21:32 +0100)]
lavf/matroskaenc: avoid assert failure in case of cuepoints with duplicated PTS
Avoid writing more than one cuepoint per track and PTS in
mkv_write_cues(). This avoids a later assertion failure on "(bytes >=
needed_bytes)" in put_ebml_num() called from end_ebml_master(), in case
there are several cuepoints per track with the same PTS.
This may happen with files containing packets with duplicated PTS in the
same track.
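
The rule described above (at most one cuepoint per track and PTS) can be sketched as follows. This is a minimal illustration and not the code from mkv_write_cues(): the CueEntry structure and the dedup_cues() helper are hypothetical names invented for this example, and the real muxer keeps more fields per cuepoint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified cue entry (assumption: not the muxer's real type). */
typedef struct CueEntry {
    int64_t pts;   /* presentation timestamp of the cuepoint */
    int     track; /* track number the cuepoint belongs to   */
} CueEntry;

/*
 * Compact the array in place so that at most one entry remains for each
 * (track, pts) pair, keeping the first occurrence and preserving order.
 * Returns the new number of entries. The quadratic scan keeps the sketch
 * short; it does not rely on the input being sorted.
 */
size_t dedup_cues(CueEntry *cues, size_t n)
{
    size_t out = 0;

    assert(cues != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < out; j++) {
            if (cues[j].pts == cues[i].pts && cues[j].track == cues[i].track) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            cues[out++] = cues[i];
    }
    return out;
}
```

With duplicates removed before writing, each (track, PTS) pair contributes exactly one entry, so the size reserved for the enclosing element is not exceeded by repeated entries.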
* commit '7ebfb466aec2c4628fcd42a72b29034efcaba4bc':
h264: Don't store intra pcm samples in h->mb
get_bits: Return pointer to buffer that is the result of the alignment
* commit 'e5ffffe48d20642acc079166f0fa7d93a6a9f594':
h264chroma: Remove duplicate 9/10 bit functions
x86: Use simple nop codes for <= sse (rather than <= mmx)
vp56: Remove clear_blocks call, and clear alpha plane U/V DC only
u-bo1b@0w.se [Mon, 18 Feb 2013 19:47:45 +0000 (20:47 +0100)]
cinepak: More correct Cinepak decoder.
Change the treatment of the strip y coordinates which previously did
not follow the description (nor did it behave like the binary decoder
on files with absolute strip offsets).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Ronald S. Bultje [Mon, 18 Feb 2013 01:01:26 +0000 (17:01 -0800)]
h264/svq3: Stop using draw_edges
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
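
The on-demand approach described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the decoder's code: block_crosses_edge() and fetch_block_emulated() are hypothetical helpers, the latter only stands in for ff_emulated_edge_mc() (whose real signature differs), and the extra pixels needed by sub-pixel interpolation filters are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if a block_w x block_h block whose top-left corner is at
 * (src_x, src_y) reads any pixel outside a pic_w x pic_h picture. */
int block_crosses_edge(int src_x, int src_y, int block_w, int block_h,
                       int pic_w, int pic_h)
{
    assert(block_w > 0 && block_h > 0 && pic_w > 0 && pic_h > 0);
    return src_x < 0 || src_y < 0 ||
           src_x + block_w > pic_w ||
           src_y + block_h > pic_h;
}

/* Clamp a coordinate into [0, limit - 1]. */
static int clamp_coord(int v, int limit)
{
    if (v < 0)
        return 0;
    if (v >= limit)
        return limit - 1;
    return v;
}

/* Copy the referenced block into dst, replicating the nearest border
 * pixel for every read that falls outside the picture. src points to
 * the top-left pixel of the picture. */
void fetch_block_emulated(uint8_t *dst, int dst_stride,
                          const uint8_t *src, int src_stride,
                          int src_x, int src_y, int block_w, int block_h,
                          int pic_w, int pic_h)
{
    for (int y = 0; y < block_h; y++) {
        int sy = clamp_coord(src_y + y, pic_h);
        for (int x = 0; x < block_w; x++) {
            int sx = clamp_coord(src_x + x, pic_w);
            dst[y * dst_stride + x] = src[sy * src_stride + sx];
        }
    }
}
```

A caller would test block_crosses_edge() first and take the slower emulated path only when it returns 1, so blocks that stay inside the picture pay only for the comparison.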
Ronald S. Bultje [Sun, 17 Feb 2013 22:52:24 +0000 (14:52 -0800)]
h264: Don't store intra pcm samples in h->mb
Instead, keep them in the bitstream buffer until we read them verbatim.
This saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Ronald S. Bultje [Tue, 12 Feb 2013 01:04:27 +0000 (17:04 -0800)]
h264: Add add_pixels4/8() to h264dsp, and remove add_pixels4 from dsputil
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Ronald S. Bultje [Tue, 29 Jan 2013 23:55:19 +0000 (15:55 -0800)]
x86: Use simple nop codes for <= sse (rather than <= mmx)
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
vp56: Remove clear_blocks call, and clear alpha plane U/V DC only
The non-alpha and alpha-Y planes are cleared in the idct_put/add()
calls. For the alpha U/V planes, we only care about the DC for entropy
context prediction purposes, the rest of the data is unused.
Ronald S. Bultje [Tue, 19 Feb 2013 05:03:02 +0000 (21:03 -0800)]
h264: integrate clear_blocks calls with IDCT.
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Ronald S. Bultje [Tue, 19 Feb 2013 05:03:01 +0000 (21:03 -0800)]
svq3: fix decoding residual blocks of b-frames.
The residual block data of 16x16 blocks was ignored for b-frames, which
leads to easy-to-identify artifacts. After this patch, the artifacts are
gone. Sample video: svq3_watermark.mov. (Fate results unaffected.)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Ronald S. Bultje [Mon, 18 Feb 2013 16:15:52 +0000 (08:15 -0800)]
split out ff_hwaccel_pixfmt_list_420[] over individual codecs.
Not all hwaccels implement all codecs, so sharing one single list across
multiple codecs means some hwaccel pixel formats are listed for codecs
that the hwaccel does not actually handle. Giving each codec its own
specific list fixes that.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
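
The idea of per-codec lists can be sketched as follows. Every identifier here is hypothetical and chosen for illustration only (the DemoPixFmt values, the two codec lists and fmt_list_contains() are not FFmpeg names); the sketch only assumes that each list ends with a sentinel value.

```c
#include <assert.h>

/* Hypothetical pixel format identifiers (assumption: not FFmpeg's enum). */
enum DemoPixFmt {
    DEMO_FMT_NONE = -1,   /* list terminator                 */
    DEMO_FMT_HWACCEL_A,   /* hwaccel that supports X and Y   */
    DEMO_FMT_HWACCEL_B,   /* hwaccel that supports X only    */
    DEMO_FMT_YUV420P      /* software fallback format        */
};

/* Each codec advertises only the formats it can really output. */
const enum DemoPixFmt demo_codec_x_fmts[] = {
    DEMO_FMT_HWACCEL_A, DEMO_FMT_HWACCEL_B, DEMO_FMT_YUV420P, DEMO_FMT_NONE
};
const enum DemoPixFmt demo_codec_y_fmts[] = {
    DEMO_FMT_HWACCEL_A, DEMO_FMT_YUV420P, DEMO_FMT_NONE
};

/* Return 1 if fmt appears in a DEMO_FMT_NONE-terminated list. */
int fmt_list_contains(const enum DemoPixFmt *list, enum DemoPixFmt fmt)
{
    assert(list != 0);
    for (; *list != DEMO_FMT_NONE; list++)
        if (*list == fmt)
            return 1;
    return 0;
}
```

With a single shared list, codec Y would also advertise DEMO_FMT_HWACCEL_B even though that hwaccel cannot decode it; separate lists avoid offering a format that would later fail.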
Ronald S. Bultje [Mon, 18 Feb 2013 02:20:17 +0000 (18:20 -0800)]
x86/dsputil: fix compilation when h263 decoder/encoder are disabled.
The symbol "ff_h263_loop_filter_strength" is defined in h263.c, but
the h263 loopfilter functions (in the .asm file) are not optimized
out (even though their function pointers are never assigned).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Ronald S. Bultje [Mon, 18 Feb 2013 01:01:26 +0000 (17:01 -0800)]
h264/svq3: stop using draw_edges.
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '488f87be873506abb01d67708a67c10a4dd29283':
roqvideodec: check dimensions validity
vqavideo: check chunk sizes before reading chunks
qdm2: check array index before use, fix out of array accesses
Ronald S. Bultje [Sun, 17 Feb 2013 22:52:24 +0000 (14:52 -0800)]
h264: don't store intra pcm samples in h->mb.
Instead, keep them in the bitstream buffer until we read them verbatim.
This saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Nicolas George [Sat, 16 Feb 2013 10:36:32 +0000 (11:36 +0100)]
doc/examples: do not allocate AVFrame directly.
The size of the AVFrame structure is not part of the ABI;
it can grow with later versions. Therefore, applications
are not supposed to allocate AVFrame directly; they are
supposed to use avcodec_alloc_frame() instead.
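
The reasoning above can be illustrated with a self-contained stand-in. DemoFrame, demo_frame_alloc() and demo_frame_free() are hypothetical names that only play the role of AVFrame and avcodec_alloc_frame(); the sketch does not use libavcodec itself.

```c
#include <assert.h>
#include <stdlib.h>

/* "Library" side: the layout may gain fields in a later version. */
typedef struct DemoFrame {
    int width;
    int height;
    /* a newer library build could append more fields here */
} DemoFrame;

/* The library allocates the structure, so sizeof(DemoFrame) is always
 * evaluated by the library build that is actually loaded, never by the
 * application that was compiled against older headers. */
DemoFrame *demo_frame_alloc(void)
{
    return calloc(1, sizeof(DemoFrame)); /* zero-initialised, NULL on failure */
}

/* Free the frame and reset the caller's pointer to NULL. */
void demo_frame_free(DemoFrame **frame)
{
    if (!frame)
        return;
    free(*frame);
    *frame = NULL;
}
```

An application that instead declared the structure on the stack, or called malloc(sizeof(DemoFrame)) itself, would bake the old size into its binary and could hand the library a buffer that is too small after an upgrade.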