Anton Khirnov [Wed, 26 Oct 2016 11:59:15 +0000 (13:59 +0200)]
decode: restructure the core decoding code
Currently, the new decoding API is pretty much just a wrapper around the
old deprecated one. This is problematic, since it interferes with making
full use of the flexibility added by the new API. The old API should
also be removed at some future point.
Reorganize the code so that the new send_packet/receive_frame functions
call the actual decoding directly and change the old deprecated
avcodec_decode_* functions into wrappers around the new API.
The new internal API for decoders is now changing as well. Before this
commit, it mirrors the public API, so the decoders need to implement
send_packet() and receive_frame() callbacks. This turns out to require
awkward constructs in both the decoders and the generic code. After this
commit, the decoders only implement the receive_frame() callback and
call a new internal function, ff_decode_get_packet() to obtain input
data, in the same manner to how the bitstream filters now work.
avcodec will now always make a reference to the input packet, which means
that non-refcounted input packets will be copied. Keeping the previous
behaviour, where this copy could sometimes be avoided, would make the
code significantly more complex and fragile for only dubious gains,
since packets are typically small and everyone who cares about
performance should use refcounted packets anyway.
Anton Khirnov [Wed, 26 Oct 2016 11:41:12 +0000 (13:41 +0200)]
decode: be more explicit about storing the last packet properties
The current code stores a pointer to the packet passed to the decoder,
which is then used during get_buffer() for timestamps and side data
passthrough. However, since this is a pointer to user data which we do
not own, storing it is potentially dangerous. It is also ill defined for
the new decoding API with split input/output.
Fix this problem by making an explicit internally owned copy of the
packet properties.
Anton Khirnov [Tue, 24 May 2016 13:09:29 +0000 (15:09 +0200)]
lavc: add a null bitstream filter
It is useful for testing/debugging and will also be used as the default
filter in the following commit adding pre-decode filtering to avoid
having a separate non-filtered codepath.
Wan-Teh Chang [Sat, 3 Dec 2016 00:56:16 +0000 (16:56 -0800)]
compat/atomics: add typecasts in atomic_compare_exchange_strong()
The Solaris and Windows emulations of atomic_compare_exchange_strong()
need typecasts to avoid compiler warnings, because the functions they
call expect a void* pointer but an intptr_t integer is passed.
Note that the emulations of atomic_compare_exchange_strong() (except
the gcc version) only work for atomic_intptr_t because of the type of
the second argument (|expected|). See
http://en.cppreference.com/w/c/atomic:
_Bool atomic_compare_exchange_strong( volatile A* obj,
C* expected, C desired );
The types of the first argument and second argument are different
(|A| and |C|, respectively). |C| is the non-atomic type corresponding
to |A|. In the emulations of atomic_compare_exchange_strong(), |C| is
intptr_t. This implies |A| can only be sig_intptr_t.
Wan-Teh Chang [Thu, 8 Dec 2016 00:16:02 +0000 (16:16 -0800)]
avutil: fix data race in av_get_cpu_flags()
Make the one-time initialization in av_get_cpu_flags() thread-safe. The
static variables |flags|, |cpuflags_mask|, and |checked| in
libavutil/cpu.c are read and written using normal load and store
operations. These are considered as data races. The fix is to use atomic
load and store operations.
Remove the |checked| variable because the invalid value of -1 for
|flags| can be used to indicate the same condition. Rename |flags| to
|cpu_flags| and move it to file scope.
The fix can be verified by running the libavutil/tests/cpu_init.c test
program under ThreadSanitizer:
./configure --toolchain=clang-tsan
make libavutil/tests/cpu_init
libavutil/tests/cpu_init
There should be no warnings from ThreadSanitizer.
Co-author: Dmitry Vyukov of Google, who suggested the data race fix.
lavu: Add AVSphericalMapping type and frame side data
While no decoder currently exports spherical information, this type
represents a frame property that has to be passed through from container
to frames.
Wan-Teh Chang [Fri, 2 Dec 2016 19:27:17 +0000 (11:27 -0800)]
configure: add -fPIE instead of -pie to C flags for ThreadSanitizer
-pie was added to C flags for ThreadSanitizer in commit 19f251a2882a8d0779b432e63bf282e4d9c443bb. Under clang 3.8.0, the -pie
flag causes a compiler warning and a linker error when running configure
--toolchain=clang-tsan. Here is an excerpt from config.log:
clang ... -fsanitize=thread -pie -std=c11 -fomit-frame-pointer -pthread -c -o /tmp/ffconf.A8SsaoCF.o /tmp/ffconf.JdpujQlD.c
clang: warning: argument unused during compilation: '-pie'
clang -fsanitize=thread -pie -Wl,--as-needed -o /tmp/ffconf.2iYA4bsw /tmp/ffconf.A8SsaoCF.o -lm -lm -lbz2 -lz -pthread
/usr/bin/ld: /tmp/ffconf.A8SsaoCF.o: relocation R_X86_64_PC32 against undefined symbol `atan2f@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
clang: error: linker command failed with exit code 1 (use -v to see invocation)
To be conservative, I changed -pie to -fPIE. But the documentation seems
to imply just -fsanitize=thread is enough:
Diego Biurrun [Mon, 28 Nov 2016 07:55:31 +0000 (08:55 +0100)]
configure: Simplify and fix avfoundation indev handling
Handle extralibs in the standard way, add missing pthreads dependency.
Also globally check for -fobj-arc with Objective-C compilers since
that option is useful for other Objective-C code as well.
Diego Biurrun [Tue, 29 Nov 2016 14:09:35 +0000 (15:09 +0100)]
Remove Plan 9 support
Supporting the system was a nice joke for the 9 release, but it has
run its course. Nowadays Plan 9 receives no testing and has no
practical usefulness.
I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.
I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.
In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.