swresample/resample: improve bessel function accuracy and speed
This improves accuracy for the bessel function at large arguments, and this in turn
should improve the quality of the Kaiser window. It also improves the
performance of the bessel function and hence build_filter by ~ 20%.
Details are given below.
Algorithm: taken from the Boost project, who have done a detailed
investigation of the accuracy of their method, as compared with e.g. the
GNU Scientific Library (GSL):
http://www.boost.org/doc/libs/1_52_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/bessel/mbessel.html.
Boost source code (also cited and licensed in the code):
https://searchcode.com/codesearch/view/14918379/.
Accuracy: sample values may be obtained as follows. i0 denotes the old bessel code,
i0_boost the approach here, and i0_real an arbitrary precision result (truncated) from Wolfram Alpha:
type "bessel i0(6.0)" to reproduce. These are evaluation points that occur for
the default kaiser_beta = 9.
Reason for accuracy: Main accuracy benefits come at larger bessel arguments, where the
Taylor-Maclaurin method performs poorly: the series is expanded about 0,
so large arguments need 23+ iterations, which can cause
significant floating point error accumulation.
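To make the iteration count concrete, here is a minimal sketch of a Maclaurin-series evaluation of I0. The name i0_series and the stopping rule are assumptions made for illustration; this is not the code that the patch removes or adds.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: evaluates the modified Bessel function I0(x) by its
 * Maclaurin series, sum over k >= 0 of (x/2)^(2k) / (k!)^2.
 * If iterations is non-NULL, it receives the number of terms added
 * after the leading 1. */
static double i0_series(double x, int *iterations)
{
    double q    = x * x / 4.0;  /* (x/2)^2 */
    double term = 1.0;          /* k = 0 term */
    double sum  = 1.0;
    int k = 0;

    assert(x >= 0.0);
    /* Stop once the newest term no longer changes the sum noticeably. */
    while (term > sum * DBL_EPSILON) {
        k++;
        term *= q / ((double)k * k);  /* ratio of consecutive terms */
        sum  += term;
    }
    if (iterations)
        *iterations = k;
    return sum;
}
```

Calling i0_series(9.0, &n) reports how many terms are summed at the default kaiser_beta; every one of those additions contributes rounding error, which is the accumulation described above.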
Benchmarks: Obtained on x86-64, Haswell, GNU/Linux via a loop calling
build_filter 1000 times:
test: fate-swr-resample-dblp-44100-2626
new:
995894468 decicycles in build_filter(loop 1000), 256 runs, 0 skips
1029719302 decicycles in build_filter(loop 1000), 512 runs, 0 skips
984101131 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
old:
1250020763 decicycles in build_filter(loop 1000), 256 runs, 0 skips
1246353282 decicycles in build_filter(loop 1000), 512 runs, 0 skips
1220017565 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
A further ~ 5% may be squeezed by enabling -ftree-vectorize. However,
this is a separate issue from this patch.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
swresample: allow double precision beta value for the Kaiser window
Kaiser windows inherently don't require beta to be an integer. This was
an arbitrary restriction. Moreover, soxr does not require it, and in
fact often estimates beta to a non-integral value.
Thus, this patch allows greater flexibility for swresample clients.
Micro version is updated.
wm4 [Fri, 6 Nov 2015 12:02:16 +0000 (13:02 +0100)]
mmaldec: correct package buffering accounting
The assert in ffmmal_stop_decoder() could trigger sometimes. The
packets_buffered counter was indeed not correctly maintained, and
packets were not subtracted from it if they were still in the waiting
queue.
For some reason, this happened especially with VC-1.
Nicolas George [Mon, 26 Oct 2015 20:07:33 +0000 (21:07 +0100)]
lavu/opt: enhance printing durations.
Trim unneeded leading components and trailing zeros.
Move the formatting code into a separate function.
Use the function also to format the default value; it was previously
printed as a plain integer, inconsistent with the way it is parsed.
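The trimming described above can be sketched as follows. The function name format_duration_sketch, the microsecond unit and the exact output layout are assumptions for illustration, not the code added to lavu/opt.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: print a non-negative duration given in microseconds
 * as H:MM:SS.frac, M:SS.frac or S.frac, omitting leading components that
 * are zero, then strip trailing zeros and a dangling decimal point. */
static void format_duration_sketch(char *buf, size_t size, int64_t us)
{
    int64_t secs = us / 1000000;
    int frac = (int)(us % 1000000);
    char *e;

    assert(us >= 0 && size >= 32);
    if (secs >= 3600)
        snprintf(buf, size, "%" PRId64 ":%02d:%02d.%06d",
                 secs / 3600, (int)(secs / 60 % 60), (int)(secs % 60), frac);
    else if (secs >= 60)
        snprintf(buf, size, "%d:%02d.%06d",
                 (int)(secs / 60), (int)(secs % 60), frac);
    else
        snprintf(buf, size, "%d.%06d", (int)secs, frac);

    /* The string always contains a '.', so this never eats digits of the
     * seconds field. */
    e = buf + strlen(buf);
    while (e > buf && e[-1] == '0')
        *--e = '\0';
    if (e > buf && e[-1] == '.')
        *--e = '\0';
}
```

With this sketch, 90500000 microseconds is printed as 1:30.5 and 2000000 as 2.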
Nicolas George [Sun, 25 Oct 2015 15:31:00 +0000 (16:31 +0100)]
lavfi: add testsrc2 test source.
Similar to testsrc, but using drawutils and therefore
supporting a lot of pixel formats instead of just rgb24.
This allows using it as input for other tests without
requiring a format conversion.
It is also slightly faster than testsrc for some reason.
The buffer needs s->bpp bytes, at maximum currently 10.
Assert that s->bpp is not larger.
This fixes a stack buffer overflow.
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
The return type of strlen is size_t, i.e. unsigned, so if pd->buf_size
is 3, the right side wraps around, leading to a wrong result of the
comparison and subsequently a heap buffer overflow.
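The pitfall can be reproduced in isolation. The sketch below is not the patched code; the function names and the shape of the comparison are assumptions chosen to show the wraparound and one way to avoid it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Buggy pattern: buf_size is converted to size_t for the subtraction, so
 * when buf_size < strlen(tag) the right side wraps to a huge value instead
 * of going negative, and the check passes. */
static int offset_in_range_buggy(int i, int buf_size, const char *tag)
{
    return i < buf_size - strlen(tag);
}

/* Safer pattern: move the length to the other side so nothing is
 * subtracted from an unsigned quantity. */
static int offset_in_range_fixed(int i, int buf_size, const char *tag)
{
    assert(i >= 0 && buf_size >= 0);
    return (size_t)i + strlen(tag) < (size_t)buf_size;
}
```

With buf_size equal to 3 and a 4-byte tag, the buggy variant accepts offset 0 while the fixed variant rejects it.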
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Fixes out of array access
Fixes: b877a6b788a25c70e8b1d014f8628549/asan_heap-oob_1da2c3f_2324_5a1b329b0b3c4bb6b1d775660ac56717.r3d
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avcodec/microdvddec: Check for string end in 'P' case
Fixes out of array read
Fixes: a9502b60f4cecc19475382aee255f73c/asan_heap-oob_1e87fba_2548_a8ad47f6dde36644fe9cdc444d4632d0.sub
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Sebastian Dröge [Thu, 5 Nov 2015 22:35:42 +0000 (23:35 +0100)]
mpegtsenc: Add support for muxing Opus in MPEG-TS
Signed-off-by: Sebastian Dröge <sebastian@centricular.com>
Previous version reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avcodec/faxcompr: Add missing runs check in decode_uncompressed()
Fixes out of array access
Fixes: 54e488b9da4abbceaf405d6492515697/asan_heap-oob_32769b0_160_a8755eb08ee8f9579348501945a33955.TIF
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes out of array access
Fixes: 24d05e8b84676799c735c9e27d97895e/asan_heap-oob_1b70f6a_2955_7c3652a7f370f9f3ef40642bc2c99bb2.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avcodec/truemotion1: Initialize mb_change_byte only when needed
Fixes out of array read
Fixes: d92114d8c2a019b8a6e50cd2a7301b54/asan_heap-oob_26bf563_60_1d3420277533de9dbf8aba3f93af346f.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This speeds up build_filter by ~ 50%. This gain should be pretty
consistent across all architectures and platforms.
Essentially, this relies on an observation that the filters have some
even/odd symmetry that may be exploited during the construction of the
polyphase filter bank. In particular, phases (scaled to [0, 1]) in [0.5, 1] are
easily derived from [0, 0.5], and expensive reevaluation of function
points is unnecessary. This requires some rather annoying even/odd
bookkeeping as can be seen from the patch.
I vaguely recall from signal processing theory more general symmetries allowing even greater
optimization of the construction. At a high level, "even functions"
correspond to 2, and one can imagine variations. Nevertheless, for the sake
of some generality and because of existing filters, this is all that is
being exploited.
Currently, this patch relies on phase_count being even or (trivially) 1,
though this is not an inherent limitation to the approach. This
assumption is safe as phase_count is 1 << phase_bits, and is hence a
power of two. There is no way for user API to set it to a nontrivial odd
number. This assumption has been placed as an assert in the code.
To repeat, this assumes even symmetry of the filters, which is the most common
way to get generalized linear phase anyway and is true of all currently
supported filters.
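The mirroring idea can be sketched on a toy filter bank. Everything here is an assumption made for illustration: the sizes TAPS and PHASES, the rational prototype (a stand-in for a windowed sinc, chosen so the example needs no math library) and the function names. It is not the build_filter code.

```c
#include <assert.h>

#define TAPS   8  /* taps per phase, even (assumed for this sketch) */
#define PHASES 4  /* phase_count, a power of two */

/* Hypothetical even prototype with h(-t) == h(t) and support |t| < TAPS/2. */
static double prototype(double t)
{
    double half = TAPS / 2.0;
    double u2 = (t * t) / (half * half);
    if (u2 >= 1.0)
        return 0.0;
    return (1.0 - u2) * (1.0 - u2) / (1.0 + t * t);
}

/* Reference: evaluate the prototype for every phase. */
static void build_direct(double bank[PHASES][TAPS])
{
    int center = (TAPS - 1) / 2;
    for (int ph = 0; ph < PHASES; ph++)
        for (int i = 0; i < TAPS; i++)
            bank[ph][i] = prototype((i - center) - (double)ph / PHASES);
}

/* Evaluate phases 0 .. PHASES/2 only; the remaining phases are the
 * tap-reversed copies of phases PHASES/2 - 1 .. 1. */
static void build_mirrored(double bank[PHASES][TAPS])
{
    int center = (TAPS - 1) / 2;
    assert(PHASES == 1 || PHASES % 2 == 0);
    for (int ph = 0; ph <= PHASES / 2; ph++)
        for (int i = 0; i < TAPS; i++)
            bank[ph][i] = prototype((i - center) - (double)ph / PHASES);
    for (int ph = PHASES / 2 + 1; ph < PHASES; ph++)
        for (int i = 0; i < TAPS; i++)
            bank[ph][i] = bank[PHASES - ph][TAPS - 1 - i];
}
```

For an even tap count, tap i of phase ph sits at the negated position of tap TAPS - 1 - i of phase PHASES - ph, so an even prototype gives the same value and roughly half of the evaluations can be skipped.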
As a side note, accuracy should be identical or perhaps slightly better
due to this "forcing" filter symmetries leading to a better phase
characteristic. As before, I can't test this claim easily, though it may
be of interest.
Patch tested with FATE.
Sample benchmark (x86-64, Haswell, GNU/Linux):
test: swr-resample-dblp-44100-2626
new:
527376779 decicycles in build_filter(loop 1000), 256 runs, 0 skips
524361765 decicycles in build_filter(loop 1000), 512 runs, 0 skips
516552574 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
old:
974178658 decicycles in build_filter(loop 1000), 256 runs, 0 skips
972794408 decicycles in build_filter(loop 1000), 512 runs, 0 skips
954350046 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
Note that lower level optimizations are entirely possible; I focussed on
getting the high level semantics correct. In any case, this should
provide a good foundation.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
avcodec/aacsbr_template: replace qsort with AV_QSORT
When sbr->reset is set in encode_frame, a bunch of qsort calls might get made.
Thus, there is the potential of calling qsort whenever the spectral
contents change.
AV_QSORT is substantially faster due to the inlining of the comparison callback.
Thus, the increase in performance should be worth the increase in binary size.
avcodec/rawenc: Cast argument for av_image_copy_to_buffer() to const
Fixes: libavcodec/rawenc.c:64:40: warning: passing argument 3 of av_image_copy_to_buffer from incompatible pointer type [enabled by default]
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Bryan Huh [Mon, 2 Nov 2015 18:20:39 +0000 (10:20 -0800)]
avformat/cache: Use int64_t to avoid int overflow in cache_read
Fixes an issue where an int64_t ffurl_seek return-value was being stored
in an int (32-bit) "r" variable, leading to integer overflow when seeking
into a large file (>2GB), and ultimately a "Failed to perform internal
seek" error message.
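The narrowing problem can be shown without any I/O. In this sketch fake_seek is a hypothetical stand-in for a seek call that returns the new 64-bit position; it is not ffurl_seek, and the helper names are assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a seek function that reports the new position. */
static int64_t fake_seek(int64_t pos)
{
    return pos;
}

/* Returns 1 if a 64-bit seek result would survive being stored in an int. */
static int fits_in_int(int64_t v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Keeping the result in an int64_t means offsets beyond 2 GB still compare
 * equal to the requested target. */
static int seek_succeeded(int64_t target)
{
    int64_t r;  /* declaring this as "int r" is the bug being fixed */

    assert(target >= 0);
    r = fake_seek(target);
    return r == target;
}
```

On platforms with a 32-bit int, a 3 GB offset fails fits_in_int, which is why the comparison against the requested position went wrong once the value had been narrowed.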
To test, try running `ffprobe 'cache:http://<something>'` on a file that
is ~3GB large, whose moov atom is at the end of the file
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
all: use FFDIFFSIGN to resolve possible undefined behavior in comparators
FFDIFFSIGN was created explicitly for this purpose, since the common
return a - b idiom is unsafe regarding overflow on signed integers. It
optimizes to branchless code on common compilers.
FFDIFFSIGN also has the subjective benefit of being easier to read due
to lack of ternary operators.
Tested with FATE.
Things not covered by this are unsigned integers, for which overflows
are well defined, and also places where overflow is clearly impossible,
e.g. an instance where the a - b was being done on 24-bit values.
This is of use for defining comparator callbacks. Common approaches like
return x-y are not safe due to the risks of overflow.
Furthermore, the (x > y) - (x < y) trick is optimized to branchless
code.
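A comparator callback built on this trick might look as follows. The macro is defined locally in the (x > y) - (x < y) form described above so that the example is self-contained; cmp_int and sort_ints are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Local definition for the example, following the (x > y) - (x < y) form. */
#define FFDIFFSIGN(x, y) (((x) > (y)) - ((x) < (y)))

/* qsort comparator for ints. "return *x - *y" would overflow for pairs
 * such as INT_MIN and INT_MAX; the sign-of-difference form cannot. */
static int cmp_int(const void *a, const void *b)
{
    const int *x = a, *y = b;
    return FFDIFFSIGN(*x, *y);
}

/* Sorts n ints in ascending order. */
static void sort_ints(int *v, size_t n)
{
    assert(v || !n);
    qsort(v, n, sizeof(*v), cmp_int);
}
```

The result is always -1, 0 or 1, so extreme inputs such as INT_MIN and INT_MAX are ordered correctly.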
This also documents this macro accordingly.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Tobias Rapp [Thu, 29 Oct 2015 08:11:37 +0000 (09:11 +0100)]
avutil/file_open: avoid file handle inheritance on Windows
Avoids inheritance of file handles on Windows systems similar to the
O_CLOEXEC/FD_CLOEXEC flag on Linux.
Fixes file lock issues in Windows applications when a child process
is started with handle inheritance enabled (standard input/output
redirection) while a FFmpeg transcoding is running in the parent
process.
Describes handle inheritance when creating new processes. Handle
inheritance must be enabled (bInheritHandles = TRUE) e.g. when you want
to pass handles for stdin/stdout via lpStartupInfo.
Signed-off-by: Tobias Rapp <t.rapp@noa-audio.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>