This sadly required making changes to the code itself,
due to the same context needing to be reused for both versions.
The lookup table had to be duplicated for both versions.
This commit refactors the power-of-two FFT, making it faster and
halving the size of all tables, making the code much smaller on
all systems.
This removes the big/small pass split, because on modern systems
the "big" pass is always faster, and even on older machines there
is no measurable speed difference.
avcodec/avcodec: Actually honour the documentation of subtitle_header
It is only supposed to be freed by libavcodec for decoders, yet
avcodec_open2() always frees it on failure.
Furthermore, avcodec_close() doesn't free it for decoders.
Both of this has been changed.
Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avformat/id3v2: Check end for overflow in id3v2_parse()
Fixes: signed integer overflow: 9223372036840103978 + 67637280 cannot be represented in type 'long' Fixes: 33341/clusterfuzz-testcase-minimized-ffmpeg_dem_DSF_fuzzer-6408154041679872 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036854775805 + 4 cannot be represented in type 'long' Fixes: 29927/clusterfuzz-testcase-minimized-ffmpeg_dem_MXF_fuzzer-5579985228267520 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avformat/wtvdec: Improve size overflow checks in parse_chunks()
Fixes: signed integer overflow: 32 + 2147483647 cannot be represented in type 'int Fixes: 32967/clusterfuzz-testcase-minimized-ffmpeg_dem_WTV_fuzzer-5132856218222592 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reviewed-by: Peter Ross <pross@xvid.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avformat/mov: check for pts overflow in mov_read_sidx()
Fixes: signed integer overflow: 9223372036846336888 + 4278255871 cannot be represented in type 'long' Fixes: 32782/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-6059216516284416 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
avcodec/jpeglsdec: Don't presume the context to contain a JLSState
Before 9b3c46a081a9f01559082bf7a154fc6be1e06c18 every call to
ff_jpegls_decode_picture() allocated and freed a JLSState. This commit
instead put said structure into the context of the JPEG-LS decoder to
avoid said allocation. But said function can also be called from other
MJPEG-based decoders and their contexts doesn't contain said structure,
leading to segfaults. This commit fixes this: The JLSState is now
allocated on the first call to ff_jpegls_decode_picture() and stored in
the context.
Found-by: Michael Niedermayer <michael@niedermayer.cc> Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avcodec/utils: Check ima wav duration for overflow
Fixes: signed integer overflow: 44331634 * 65 cannot be represented in type 'int' Fixes: 32120/clusterfuzz-testcase-minimized-ffmpeg_dem_RSD_fuzzer-5760221223583744 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1184429040541376544 * 32 cannot be represented in type 'long' Fixes: 31788/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6236746338664448 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
doc/muxers.texi: fix build issue for unknown command
The build log:
** Unknown command `@code' (left as is) (in src/doc/muxers.texi l. 2020)
*** '{' without macro. Before: -map} option with the ffmpeg CLI tool. (in src/doc/muxers.texi l. 2020)
*** '}' without opening '{' before: option with the ffmpeg CLI tool. (in src/doc/muxers.texi l. 2020)
Relying on the order of the enum is bad.
It clashes with the new presets having to sit at the end of the list, so
that they can be properly filtered out by the options parser on builds
with older SDKs.
So this refactors nvenc.c to instead rely on the internal NVENC_LOSSLESS
flag. For this, the preset mapping has to happen much earlier, so it's
moved from nvenc_setup_encoder to nvenc_setup_device and thus runs
before the device capability check.
This would only make a difference in case the first attempt to
initialize the encoder failed and the second succeeded. The only
reason I can think of for this to happen is that the options (in
particular the codec whitelist) are not used for the second try
and that obviously implies that we should not even try a second time
to open the decoder.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avformat/matroskaenc: Remove unnecessary function calls
ffio_fill() is used when initially writing unknown length elements;
yet it can happen that the amount of bytes written by it is zero in
which case it is of course unnecessary to ever call it. Whether it is
possible to know this during compiletime depends upon how aggressively
the compiler inlines function calls (i.e. if it inlines calls to
start_ebml_master() where the upper bound for the size of the element
implies that the size will be written on one byte) and this depends upon
optimization settings. It is not the aim of this patch to inline all
calls where it is known that ffio_fill() will be unnecessary, but merely
to make compilers that inline such calls aware of the fact that writing
zero bytes with ffio_fill() is unnecessary. To this end
av_builtin_constant_p() is used to check whether the size is a
compiletime constant.
For GCC 10 this made a difference at -O3 only: The size of .text
decreased from 0x747F (with 29 calls to ffio_fill(), eight of which
use size zero) to 0x7337 (with 21 calls to ffio_fill(), zero of which
use size zero).
For Clang 11 it made a difference at -O2 and -O3: At -O2, the size of
.text decreased from 0x879C to 0x871C (with eight calls to ffio_fill()
eliminated); at -O3 the size of .text decreased from 0xAF2F to 0xAEBF.
Once again, eight calls to ffio_fill() with size zero have been
eliminated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avformat/wavdec: Fix reading files with id3v2 apic before fmt tag
Up until now the cover images will get the stream index 0 in this case,
violating the hardcoded assumption that this is the index of the audio
stream. Fix this by creating the audio stream first; this is also in
line with the expectations of ff_pcm_read_seek() and
ff_spdif_read_packet(). It also simplifies the code to parse the fmt and
xma2 tags.
avformat/id3v2: Don't reverse the order of id3v2 APICs
When parsing ID3v2 tags, special (non-text) metadata is not applied
directly and unconditionally; instead it is stored in a linked list
in which elements are prepended. When traversing the list to add APICs
(or private tags) at the end, the order is reversed. The same also
happens for chapters and therefore the chapter parsing code already
reverses the chapters.
This commit changes this: By keeping pointers to both head and tail
of the linked list one can preserve the order of the entries and
remove the reordering code for chapters. Only the pointer to head
will be exported: No current caller uses a nonempty list, so exporting
both head and tail is unnecessary. This removes the functionality
to combine the lists of special metadata read from different ID3v2 tags,
but that doesn't make really much sense anyway (and would be trivial
to implement if desired) and allows to remove the now unnecessary
initializations performed by the callers.
The FATE-reference for the id3v2-priv test had to be updated
because the order of the tags read into the dict is reversed;
for id3v2-priv-remux only the md5 and not the ffprobe output
of the remuxed file changes because the order of the private tags
has up until now been reversed twice.
The references for the aiff/mp3 cover-art tests needed to be updated,
because the order of the attached pics is reversed upon reading.
It is still not correct, because the muxers write the pics in the order
in which they arrive at the muxer instead of the order given by
pkt->stream_index.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
fate/cover-art: Add test for writing id3v2 tags and apic with AIFF/MP3
Notice that the order of the APIC tracks is currently wrong. This is
a superposition of two bugs: (i) Both muxers write the attached
pictures in the order they arrive in the muxer and not in the
stream_index order, leading to attached pictures that are copied being
written earlier because their timestamp is AV_NOPTS_VALUE, whereas the
timestamp of the encoded pictures is 0. (ii) A bug in the id3v2 parsing
code reverses the order of the parsed pictures.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: shift exponent 251 is too large for 32-bit type 'int' Fixes: 32147/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DPX_fuzzer-5519111675314176 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Guo, Yejun [Sun, 7 Feb 2021 06:36:13 +0000 (14:36 +0800)]
lavfi: add filter dnn_detect for object detection
Below are the example steps to do object detection:
1. download and install l_openvino_toolkit_p_2021.1.110.tgz from
https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit/download.html
or, we can get source code (tag 2021.1), build and install.
2. export LD_LIBRARY_PATH with openvino settings, for example:
.../deployment_tools/inference_engine/lib/intel64/:.../deployment_tools/inference_engine/external/tbb/lib/
3. rebuild ffmpeg from source code with configure option:
--enable-libopenvino
--extra-cflags='-I.../deployment_tools/inference_engine/include/'
--extra-ldflags='-L.../deployment_tools/inference_engine/lib/intel64'
4. download model files and test image
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/face-detection-adas-0001.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/face-detection-adas-0001.xml
wget
https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/face-detection-adas-0001.label
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/images/cici.jpg
5. run ffmpeg with:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,showinfo -f null -
We'll see the detect result as below:
[Parsed_showinfo_1 @ 0x560c21ecbe40] side data - detection bounding boxes:
[Parsed_showinfo_1 @ 0x560c21ecbe40] source: face-detection-adas-0001.xml
[Parsed_showinfo_1 @ 0x560c21ecbe40] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_1 @ 0x560c21ecbe40] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
There are two faces detected with confidence 100% and 69.17%.
James Almer [Thu, 15 Apr 2021 04:28:27 +0000 (01:28 -0300)]
avformat/movenc: fix writing dOps atoms
Don't blindly copy all bytes in extradata past ChannelMappingFamily. Instead
check if ChannelMappingFamily is not 0 and then only write the correct amount
of bytes from ChannelMappingTable, as defined in the spec[1].
James Almer [Sun, 11 Apr 2021 01:53:34 +0000 (22:53 -0300)]
avformat/webpenc: don't assume animated webp streams will have more than one packet
The libwebp_animencoder returns a single packet with the entire animated
stream, as that's what the external library produces. As such, only ensure the
stream was produced by said encoder (or propagated by a demuxer, once support
is added) when attempting to write the requested loop value.
Tobias Rapp [Wed, 31 Mar 2021 09:41:49 +0000 (11:41 +0200)]
fftools/ffprobe: Remove check on show_frames and show_packets in XML writer
The "packets_and_frames" element has been added to ffprobe.xsd in 0c9f0da0f7656059e9bd41931d250aafddf35ea3 but apparently removing the
check in ffprobe.c has been forgotten.
avcodec/nellymoserenc: Fix segfault when using unsupported channels/rate
NellyMoserEncodeContext.avctx is only set in init after these checks,
yet it is used by encode_end().
This is a regression since 0a56bfa71f751a2b25da8d060a019c1c75ca9d7b.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
mov: Prioritize aspect ratio values found in pasp atom
From the ISO/IEC specification for MP4:
The pixel aspect ratio and clean aperture of the video may be specified
using the ‘pasp’ and ‘clap’ sample entry boxes, respectively. These are
both optional; if present, they over-ride the declarations (if any) in
structures specific to the video codec, which structures should be
examined if these boxes are absent. For maximum compatibility, these
boxes should follow, not precede, any boxes defined in or required by
derived specifications.
Martin Storsjö [Mon, 12 Apr 2021 07:31:22 +0000 (10:31 +0300)]
aarch64: h264pred: Optimize the inner loop of existing 8 bit functions
Move the loop counter decrement further from the branch instruction,
this hides the latency of the decrement.
In loops that first load, then store (the horizontal prediction cases),
do the decrement after the load (where the next instruction would
stall a bit anyway, waiting for the result of the load).
In loops that store twice using the same destination register,
also do the decrement between the two stores (as the second store
would need to wait for the updated destination register from the
first instruction).
In loops that store twice to two different destination registers,
do the decrement before both stores, to do it as soon before the
branch as possible.
This gives minor (1-2 cycle) speedups in most cases (modulo measurement
noise), but the horizontal prediction functions get a rather notable
speedup on the Cortex A53.
Despite this, a number of these functions still are slower than
what e.g. GCC 7 generates - this shows the relative speedup of the
neon codepaths over the compiler generated ones:
In particular, on Cortex A72, many of these functions are slower
than the compiler generated code, while they're more beneficial on
e.g. the Cortex A73.
fftools/ffmpeg_filter: Fix check for mjpeg encoder
The MJPEG encoder supports some pixel format/color range combinations
only when strictness is set to unofficial or less. Before commit 059fc2d9da5364627613fb3e6424079e14dbdfd3 said encoder's pix_fmts array
only included the pixel formats supported with default strictness.
When strictness was <= unofficial, fftools/ffmpeg_filter.c used
an extended list of pixel formats instead of the encoder's including
the pixel formats only supported when strictness <= unofficial.
Said commit turned the logic around: The encoder's pix_fmts array now
included all pixel formats and fftools/ffmpeg_filter.c instead used
a small list of all pixel formats supported when strictness is >
unofficial and the encoder's pixel formats instead. In particular,
the codec's pix_fmt is not used when strictness is normal.
This works for the mjpeg encoder; yet it did not work for other
(hardware-based) mjpeg encoders, because the check for whether one is
using the MJPEG encoder is wrong: It just checks the codec id.
So if one used strict unofficial with a hardware-accelerated MJPEG
encoder before commit 059fc2d9da53, the unofficial (non-hardware)
pixel formats of the MJPEG encoder would be used; since said commit
the codec's pixel formats are overridden at ordinary strictness
by the ordinary MJPEG pixel formats. This leads to format conversion
errors lateron which were reported in #9186.
The solution to this is to check AVCodec.name instead of its id.
Fixes ticket #9186.
Tested-by: Eoff, Ullysses A <ullysses.a.eoff@intel.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avformat/matroskaenc: Put subtitles without duration into SimpleBlocks
The value zero for AVPacket.duration means that the duration is unknown,
which in practice means "play this subtitle until overridden by the next
subtitle". Yet for Matroska a BlockGroup with duration zero means
that the subtitle really has a duration zero. "Display until overridden"
is achieved by not setting a duration on the container level at all and
this is achieved by using a SimpleBlock or a BlockGroup without
duration. This commit implements this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
avformat/mvi: Check audio_data_size to be non negative
Fixes: left shift of negative value -224 Fixes: 32144/clusterfuzz-testcase-minimized-ffmpeg_dem_MVI_fuzzer-4971479323246592 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fei Wang [Wed, 31 Mar 2021 02:07:44 +0000 (10:07 +0800)]
lavfi/qsvvpp: support async depth
Async depth will allow qsv filter cache few frames, and avoid force
switch and end filter task frame by frame. This change will improve
performance for some multi-task case, for example 1:N transcode(
decode + vpp + encode) with all QSV plugins.
Performance data test on my Coffee Lake Desktop(i7-8700K) by using
the following 1:8 transcode test case improvement:
1. Fps improved from 55 to 130.
2. Render/Video usage improved from ~61%/~38% to ~100%/~70%.(Data get
from intel_gpu_top)
Jan Ekström [Wed, 7 Apr 2021 18:17:04 +0000 (21:17 +0300)]
avformat/img2dec: set r_frame_rate in addition to avg_frame_rate
Apparently for various image sequences libavformat/utils.c can
calculate rather fancy r_frame_rate values, such as `186/1921`,
and since ffmpeg.c utilizes r_frame_rate for the filter chain
time base, this can quite deteriorate the output frame timing - even
though the user has requested the image sequence to be interpreted
at a specific, constant frame rate.