Martin Storsjö [Wed, 30 Dec 2015 21:46:06 +0000 (23:46 +0200)]
checkasm: Check register clobbering on arm
Use two separate functions, depending on whether VFP/NEON is available.
This is set to require armv5te - it uses blx, which is only available
since armv5t, but we don't have a separate configure item for that.
(It also uses ldrd, which requires armv5te, but this could be avoided
if necessary.)
asfdec: reject size > INT64_MAX in asf_read_unknown
Both avio_skip and detect_unknown_subobject use int64_t for the size
parameter.
This fixes a segmentation fault due to infinite recursion.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Alexandra Hájková <alexandra.khirnova@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
asfdec: only set asf_pkt->data_size after sanity checks
Otherwise invalid values are used unchecked in the next run.
This can cause NULL pointer dereferencing.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Alexandra Hájková <alexandra.khirnova@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>
Anton Khirnov [Thu, 17 Dec 2015 18:38:24 +0000 (19:38 +0100)]
h264: improve behaviour with invalid reference lists
Before 741b494fa8cd28a7d096349bac183893c236e3f9, when the reference list
modification description was invalid, the code would substitute the
corresponding reference from the initial ("default") reference list.
After that commit, it will just return an error.
Since there are apparently invalid samples in the wild that used to play
fine with the old code, it is a good idea to re-add some sort of error
resilience here. So, when the reference list modification results in a
missing frame, substitute a previous reference frame for it. The
relevant sample again decodes fine with the same output as previously.
Janne Grunau [Tue, 29 Dec 2015 11:08:38 +0000 (12:08 +0100)]
x86: use emms after ff_int32_to_float_fmul_scalar_sse
Intel's Instruction Set Reference (as of September 2015) clearly states
that cvtpi2ps switches to MMX state. Actual CPUs do not switch if the
source is a memory location. The Instruction Set Reference from 1999
(Order Number 243191) describes this behaviour but all later versions
I've seen have make no distinction whether MMX registers or memory is
used as source.
The documentation for the matching SSE2 instruction to convert to double
(cvtpi2pd) was fixed (see the valgrind bug
https://bugs.kde.org/show_bug.cgi?id=210264).
It will take time to get a clarification and fixes in place. In the
meantime it makes sense to change ff_int32_to_float_fmul_scalar_sse to
be correct according to the documentation. The vast majority of users
will have SSE2 so a change to the SSE version has little effect.
Fixes fate-checkasm on x86 valgrind targets.
Valgrind 'bug' reported as https://bugs.kde.org/show_bug.cgi?id=357059
Janne Grunau [Tue, 22 Dec 2015 21:51:55 +0000 (22:51 +0100)]
checkasm: x86: post commit review fixes
Check the full FPU tag word instead of only the lower half and simplify
the comparison.
Use upper-case function base name as macro name to instantiate both
checked_call variants.
dca: change the core to work with integer coefficients.
The DCA core decoder converts integer coefficients read from the
bitstream to floats just after reading them (along with dequantization).
All the other steps of the audio reconstruction are done with floats
which makes the output for the DTS lossless extension (XLL)
actually lossy.
This patch changes the DCA core to work with integer coefficients
until QMF. At this point the integer coefficients are converted to floats.
The coefficients for the LFE channel (lfe_data) are not touched.
This is the first step for the really lossless XLL decoding.
Janne Grunau [Fri, 11 Dec 2015 13:06:38 +0000 (14:06 +0100)]
x86: checkasm: check for or handle missing cleanup after MMX instructions
Not every asm routine is expected clear the MMX state after returning.
It is however a requisite for testing floating point code in checkasm.
Annotate functions requiring cleanup with declare_func_emms() and issue
emms after the call. The remaining functions are checked for having a
cleared MMX state after return.
opus: Fix typo causing overflow in silk_stabilize_lsf
Due to this typo max_center can be too large, causing nlsf to be set to
too large values, which in turn can cause nlsf[i - 1] + min_delta[i] to
overflow to a negative value, which is not allowed for nlsf and can
cause an out of bounds read in silk_lsf2lpc.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Janne Grunau [Thu, 3 Dec 2015 15:17:32 +0000 (16:17 +0100)]
arm: add ff_int32_to_float_fmul_array8_neon
Quite a bit faster than int32_to_float_fmul_array8_c calling
ff_int32_to_float_fmul_scalar_neon through FmtConvertContext.
Number of cycles per int32_to_float_fmul_array8 call while decoding
padded.dts on exynos5422:
before after change
cortex-a7: 1270 951 -25%
cortex-a15: 434 285 -34%
Interesting are the differences between
int32_to_float_fmul_array8_neon_c and int32_to_float_fmul_array8_neon.
The former is current behaviour of calling
ff_int32_to_float_fmul_scalar_neon repeatedly from the c function,
The raw numbers differ since checkasm uses different lengths than the
dca decoder.
Janne Grunau [Wed, 9 Dec 2015 21:28:36 +0000 (22:28 +0100)]
arm: add a cpu flag for the VFPv2 vector mode
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.
Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
Janne Grunau [Wed, 2 Dec 2015 23:12:39 +0000 (00:12 +0100)]
arm64: add cycle counter support
The ISB (instruction synchronization barrier) might be too heavy for
START/STOPTIMER use but should be more accurate in checkasm where the
timing overhead is subtracted.
Janne Grunau [Thu, 10 Dec 2015 20:49:30 +0000 (21:49 +0100)]
libavutil: move FFALIGN macro from common.h to macros.h
Include macros.h explicitly in common.h so that external code using
FFALIGN does not break. It was already implicitly included through
version.h. Include macros.h in lls.h and internal.h for FFALIGN.
lls.h was including common.h only for FFALIGN and internal.h was
missing the include for FFALIGN. `make checkheaders` did not catch it
because it's an internal header.
Stefan Pöschel [Sat, 12 Dec 2015 19:26:38 +0000 (20:26 +0100)]
mpegtsenc: add flag to embed an AC-3 ES the DVB way
So far an AC-3 elementary stream is refered to in the PMT according to
System A (ATSC). However System B (DVB) has a different way to signal an AC-3
ES within the PMT. This different way can be enabled by a new flag. The flag is
more generally named 'system_b' as there are further differences between ATSC
and DVB (e.g. the signalling of E-AC-3) which should then also be covered by it
in the future.
Anton Khirnov [Sun, 11 Oct 2015 09:08:24 +0000 (11:08 +0200)]
flvdec: do not create any streams in read_header()
The current muxer behaviour is to create streams in read_header() based
on the audio/video presence flags, but fill in the stream parameters
later when we actually get some packets for them. This is rather shady,
since other demuxers set the stream parameters immediately when the
stream is created and do not touch the stream codec context after that.
Change the flv demuxer to behave in the same way as other similar
demuxers -- create the streams only when we get a packet for them.
Anton Khirnov [Fri, 9 Oct 2015 13:16:46 +0000 (15:16 +0200)]
mpegaudiodecheader: check the header in avpriv_mpegaudio_decode_header
Almost all the places from which this function is called already check
the header manually and in the two that don't (the mp3 muxer) the check
should not cause any problems.
Anton Khirnov [Tue, 30 Dec 2014 06:51:04 +0000 (07:51 +0100)]
ff_parse_specific_params: do not use AVCodecContext.frame_size
It will not be set unless the muxing codec context is also the encoding
context, which is discouraged. When the frame size is not known from
av_get_audio_frame_duration(), the fallback should still be good enough.
Anton Khirnov [Sat, 15 Nov 2014 15:48:59 +0000 (16:48 +0100)]
rmenc: do not use AVCodecContext.frame_size
It will not be set if the stream codec context is not the encoding
context. Use av_get_audio_frame_duration() instead, it should work for
all audio codecs supported by the muxer.
Aaron Colwell [Wed, 2 Dec 2015 23:13:18 +0000 (18:13 -0500)]
matroskadec: Fix sample_aspect_ratio for stereo matroska content
matroskaenc applies divisors to the display width/height when generating
stereo content. This patch adds the corresponding multipliers to matroskadec
so that the original sample aspect ratio can be recovered.
Vittorio Giovara [Mon, 30 Nov 2015 17:19:36 +0000 (12:19 -0500)]
lavc: Drop exporting 2-pass encoding stats
These variables are coming from mpegvideoenc where are supposedly used
as bit counters on various frame properties. However their use is
unclear as they lack documentation, are available only from a very small
subset of encoders, and they are hardly used in the wild. Also frame_bits
in aacenc is employed in a similar way.
Remove this functionality from AVCodecContex, these variable are mostly
frame properties, and too few encoders support setting them with anything
useful.
Anton Khirnov [Sat, 15 Nov 2014 09:41:02 +0000 (10:41 +0100)]
mxfenc: always assume long gop
Checking the codec context parameters to find out this information is
far too unreliable to be useful, so it is safer to assume B-frames are
always present.
Anton Khirnov [Mon, 30 Nov 2015 16:51:48 +0000 (17:51 +0100)]
h264: eliminate default_ref_list
According to the spec, the reference list for a slice should be
constructed by first generating an initial (what we now call "default")
reference list and then optionally applying modifications to it.
Our code has an optimization where the initial reference list is
constructed for the first inter slice and then rebuilt for other slices
if needed. This, however, adds complexity to the code, requires an extra
2.5kB array in the codec context and there is no reason to think that it
has any positive effect on performance. Therefore, simplify the code by
generating the reference list from scratch for each slice.