For the callable function (as opposed to the inline one):
C SSE SSE2 SSE4
Win32: 47 42 29 26
Win64: 30 33 25 23
The SSE version is neither compiled nor set for ARCH_X86_64, as the
inlinable function takes over.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Martin Storsjö [Tue, 4 Feb 2014 14:28:24 +0000 (16:28 +0200)]
arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols
This makes the generated assembly more internally consistent,
avoiding declaring two labels for the same function (for cases
where EXTERN_ASM is empty) and not declaring a separate unprefixed
label in other cases.
This also makes sure the .func and .type delcarations have the same
prefix. They have previously not been used on the platforms
that have prefixed symbols on arm (iOS), but gas-preprocessor
has recently started using the .func declarations for adding
.thumb_func declarations for such functions.
Janne Grunau [Fri, 31 Jan 2014 11:52:07 +0000 (12:52 +0100)]
fate: force the simple idct for xvid custom matrix test
The original test without a forced idct is still useful since it tests
the switching of the idct algorithm/permutation on x86 with MMX. MMXext
or SSE2. Make sure the test runs only if MMX inline asm is available and
force -cpuflags to all.
Add the required bitexact flag for both tests.
Luca Barbato [Mon, 20 Jan 2014 12:28:37 +0000 (13:28 +0100)]
lavf: improve handling of sparse streams when muxing
Currently ff_interleave_packet_per_dts() waits until it gets a frame for
each stream before outputting packets in interleaved order.
Sparse streams (i.e. streams with much fewer packets than the other
streams, like subtitles or audio with DTX) tend to add up latency and in
specific cases end up allocating a large amount of memory.
Emit the top packet from the packet_buffer if it has a time delta
larger than a specified threshold.
Original report of the issue and initial proposed solution by
mus.svz@gmail.com.
Bug-id: 31 Signed-off-by: Anton Khirnov <anton@khirnov.net>
Jan Ekström [Thu, 30 Jan 2014 08:37:53 +0000 (08:37 +0000)]
atrac3plus: Make initialization dependant on channel count rather than channel map
Makes it easier to recreate an AVCodecContext for ATRAC3+ decoding,
which is needed in multimedia frameworks, as well as in general cases
where demuxing and decoding are separate entities.
Ronald S. Bultje [Thu, 24 Oct 2013 10:54:32 +0000 (06:54 -0400)]
x86: videodsp: Properly mark sse2 instructions in emulated_edge_mc as such.
Should fix crashes or corrupt output on pre-SSE2 CPUs when they were
using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in
hfix or hvar single-edge (left/right) extension functions.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Janne Grunau [Sat, 25 Jan 2014 16:34:19 +0000 (17:34 +0100)]
configure: clang: explicitly state dep file and rule name in DEPFLAGS
Fixes dependency file generation with gas-preprocessor.pl and clang.
Flags copied from GCC and tested with Apple's clang from Xcode 5 and
5.1 and clang 3.2, 3.3, 3.4 on Linux.
Janne Grunau [Fri, 24 Jan 2014 15:22:44 +0000 (16:22 +0100)]
mpeg12: check scantable indices in all decode_block functions
Add checks to the fast functions used with CODEC_FLAGS2_FAST and move
the check for all other functions to before the invalid memory is
accessed. Fixes https://trac.videolan.org/vlc/ticket/9713 with
CODEC_FLAGS2_FAST.
vp9: fix bugs in updating coef probabilities with parallelmode=1
- The memcpy was completely wrong because
s->prob_ctx[s->framectxid].coef is a [4][2][2][6][6][3] array, whereas
s->prob.coef is a [4][2][2][6][6][11] array.
- The additional check was committed to ffmpeg by Ronald S. Bultje.
Janne Grunau [Mon, 20 Jan 2014 17:16:54 +0000 (18:16 +0100)]
h264: skip chroma edges at the picture boundary while deblocking 4:4:4
This handles macroblock edges for the chroma components in the same way
as for the luma compoment for 4:4:4 streams. The Spec explicitly states
that the deblocking filter is not applied to edges at the boundary of
the picture.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Anton Khirnov [Mon, 20 Jan 2014 14:22:09 +0000 (15:22 +0100)]
lavc: do not force the emu edge flag
The default get_buffer2() implementation (and possibly some
user ones) does not allocate edges when this flag is set, which may
expose bugs in some decoders. Until the 10 release is out, it is safer
to remove this part.
vc1: Always reset numref when parsing a new frame header.
Fixes an issue where the B-frame coding mode switches from interlaced
fields to interlaced frames, causing incorrect decisions in the motion
compensation code and resulting in visual artifacts.
CC: libav-stable@libav.org Signed-off-by: Tim Walker <tdskywalker@gmail.com>
Justin Ruggles [Thu, 16 Jan 2014 20:59:05 +0000 (15:59 -0500)]
mov: do not set avg_frame_rate in the demuxer
The track duration is often not reliable or is not the duration
represented by the number of frames. In those cases, avg_frame_rate
was reported incorrectly. Removing this code falls back to the
default calculation in avformat_find_stream_info().
Anton Khirnov [Thu, 28 Nov 2013 09:54:35 +0000 (10:54 +0100)]
h264: reset num_reorder_frames if it is invalid
An invalid VUI is not considered a fatal error, so the SPS containing it
may still be used. Leaving an invalid value of num_reorder_frames there
can result in writing over the bounds of H264Context.delayed_pic.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Anton Khirnov [Thu, 28 Nov 2013 09:54:35 +0000 (10:54 +0100)]
h264: limit allowed pred modes in ff_h264_check_intra_pred_mode() to 3
Higher modes are not allowed for 16x16/chroma, which is what this
function is used for. Otherwise this function would return 0 (vertical
prediction) for invalid higher modes, which could result in invalid
reads.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Reviewed-by: Stephen Hutchinson <qyot27@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: Anton Khirnov <anton@khirnov.net>