Hendrik Leppkes [Wed, 14 Oct 2015 21:20:09 +0000 (23:20 +0200)]
w32pthreads: fix mingw build on x86 with -msse2 or higher
When SSE2 or higher compiler optimizations are used, mingw uses
the _mm_mfence intrinsic for MemoryBarrier; however, it doesn't include
the appropriate headers automatically.
wm4 [Sun, 11 Oct 2015 17:02:40 +0000 (19:02 +0200)]
avcodec/h264: remove redundant and bogus get_format call
The AVCodecContext.get_format callback is not only used for pixel format
negotiation with the API user, but also for hwaccel init. For the
latter, it's required that some codec parameters, in particular the
codec profile, are set when the callback is invoked.
This patch removes a get_format invocation where this is not guaranteed.
The codec parameters, including the profile, are really set further
below. (The same code path that sets the profile also calls get_format
properly too.)
This just happened to work by coincidence in most cases. For example, if
the API user just copied or reused the AVStream's AVCodecContext when
decoding, the profile would be set properly. But in some cases it
fails, such as with the sample WolfensteinTwitch.mp4 on the samples
server.
Remove the redundant get_format call. Apparently it serves no purpose
anymore, although it is possible that this was different at the time it
was added in commit ffd77f94a26be22b8ead3178ceec3ed39e68abc5.
This fixes hwaccel usage for API users which do not set the profile
when setting up the AVCodecContext (which is allowed).
_beginthreadex is for desktop only. CreateThread is available for Windows Store apps on Windows (and Windows Phone) 8.1 and later. http://msdn.microsoft.com/en-us/library/ms682453%28VS.85%29.aspx
Anssi Hannula [Thu, 15 Oct 2015 10:42:38 +0000 (13:42 +0300)]
avformat/hls: fix segment selection regression on track changes of live streams
Commit ad701326b43078b90 ("avformat/hls: open playlists immediately when
AVDISCARD_ALL is dropped") inadvertently caused first_packet to never be
cleared, causing select_cur_seq_no() to not use the specific code for
live streams.
In practice this means that when the user selects a different audio
track during live stream (i.e. non-VOD) playback, there may be some
additional delay as the code might select an incorrect segment at first,
and we have to wait for video to catch up with audio (if too late a
segment was selected) or to download more of the following audio
segments (if too early a segment was selected).
Fix that by restoring the zeroing of first_packet.
The "loop" option is used in several demuxers (like img2dec) and muxers; using the same name in ffmpeg_opt
breaks them. Feel free to revert this and replace it with any other solution, or rename both as preferred.
This is just a quick fix to avoid the regression with existing command lines and to avoid having both named
the same (which does not work).
Example:
./ffmpeg -loop 1 -i fate-suite/png1/lena-rgb24.png -t 1 test.avi
will produce 25 frames with the img2dec loop but only 1 frame at 25fps with the ffmpeg loop option
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It has already been demonstrated that the de Bruijn method has benefits
over the current implementation: commit 971d12b7f9d7be3ca8eb98e6c04ed521f83cbd3c.
That commit implemented it for long long, this extends it to the int version.
Tested with FATE.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
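As a rough illustration of the de Bruijn method mentioned in the commit above, the sketch below counts the trailing zero bits of a 32-bit value with one multiplication and one table lookup. The names ctz32_debruijn and debruijn_ctz32_tab are hypothetical, and the constant 0x077CB531 with its table is the widely published 32-bit pair, which is not necessarily what the actual patch uses.

```c
#include <stdint.h>

/* Lookup table that pairs with the de Bruijn constant 0x077CB531. */
static const uint8_t debruijn_ctz32_tab[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Count trailing zeros of v. The result is only meaningful for v != 0. */
static inline int ctz32_debruijn(uint32_t v)
{
    /* (v & -v) isolates the lowest set bit, i.e. a power of two 1 << n.
     * Multiplying the de Bruijn constant by it shifts the constant left
     * by n, and the top 5 bits of the product are unique for each n. */
    uint32_t lowest = v & (0u - v);
    return debruijn_ctz32_tab[(uint32_t)(lowest * 0x077CB531u) >> 27];
}
```

The appeal of this approach is that it needs no loop and no branch, at the cost of a 32-byte table.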
avfilter/all: propagate errors of functions from avfilter/formats
Many of the functions from avfilter/formats can return errors, usually AVERROR(ENOMEM).
This propagates the return values.
All of these were found by using av_warn_unused_result, demonstrating its utility.
Tested with FATE. I am least sure of the changes to avfilter/filtergraph,
since I don't know how reduce_format is intended to behave or how it should
react to errors.
Fixes: CID 1325680, 1325679, 1325678.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Previous version Reviewed-by: Nicolas George <george@nsup.org>
Previous version Reviewed-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
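The error propagation pattern described in the avfilter/formats commit above can be sketched roughly as follows. This is a minimal sketch, not the actual libavfilter code: add_format and query_formats_sketch are hypothetical stand-ins, WARN_UNUSED plays the role of av_warn_unused_result, and -ENOMEM stands in for AVERROR(ENOMEM).

```c
#include <errno.h>
#include <stdlib.h>

/* Stand-in for av_warn_unused_result: GCC and Clang warn when the
 * return value of a function marked this way is ignored. */
#if defined(__GNUC__)
#define WARN_UNUSED __attribute__((warn_unused_result))
#else
#define WARN_UNUSED
#endif

/* Hypothetical helper that appends one format and may fail on allocation. */
WARN_UNUSED static int add_format(int **list, int *count, int fmt)
{
    int *tmp = realloc(*list, (size_t)(*count + 1) * sizeof(**list));
    if (!tmp)
        return -ENOMEM;      /* the old list is left untouched on failure */
    tmp[(*count)++] = fmt;
    *list = tmp;
    return 0;
}

/* Hypothetical caller: every return value is checked and passed upwards
 * instead of being silently dropped. */
WARN_UNUSED static int query_formats_sketch(int **list, int *count)
{
    int ret;

    if ((ret = add_format(list, count, 1)) < 0)
        return ret;
    if ((ret = add_format(list, count, 2)) < 0)
        return ret;
    return 0;
}
```

With the attribute in place, a call such as add_format(&list, &count, 1); with the result discarded produces a compiler warning, which is how ignored errors of this kind can be found.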
Andrey Utkin [Tue, 13 Oct 2015 09:44:37 +0000 (12:44 +0300)]
httpauth: Add space after commas in HTTP/RTSP auth header
This fixes access to Grandstream cameras, which return 401 otherwise.
VLC sends the Authorization: header with spaces between parameters, and it
is known to work with Grandstream devices and a broad range of other HTTP
and RTSP servers, so the author considers switching to such behaviour safe.
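The formatting change described above amounts to separating the auth parameters with a comma followed by a space rather than a bare comma. A minimal sketch, assuming a Digest header and a hypothetical helper format_digest_auth (the real httpauth code builds its header differently and includes more fields):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a Digest Authorization header into buf,
 * using ", " (comma plus space) between parameters.
 * Returns the value of snprintf, i.e. the length that was needed. */
static int format_digest_auth(char *buf, size_t size,
                              const char *user, const char *realm,
                              const char *nonce, const char *uri,
                              const char *response)
{
    return snprintf(buf, size,
                    "Authorization: Digest username=\"%s\", realm=\"%s\", "
                    "nonce=\"%s\", uri=\"%s\", response=\"%s\"\r\n",
                    user, realm, nonce, uri, response);
}
```

All argument values are supplied by the caller; nothing here computes the digest response itself.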
Vittorio Giovara [Mon, 12 Oct 2015 16:54:52 +0000 (18:54 +0200)]
libschroedinger: Properly use AVFrame API
Rather than copying data buffers around, allocate a proper frame, and
use the standard AVFrame functions. This effectively makes the decoder
capable of direct rendering.
-duration is not a safe expression, since duration can be INT_MIN.
One might ask how it can become INT_MIN.
Although it is true that line 2574 is no longer reached with INT_MIN due
to commit 053e80f6eaf8d87521fe58ea96886b6ee0bbe59d (which fixed another
integer overflow issue), mov_update_dts_shift is called on line 3549 as
well, right after a read of untrusted data.
One can do the fix locally there, but that function is already a huge
mess. Changing mov_update_dts_shift is likely better.
This changes duration to INT_MIN + 1 in such cases. This should not make any
practical difference, since such streams are fuzzer files anyway.
Tested with FATE.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
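The overflow discussed above comes from the fact that -INT_MIN is not representable in an int, so negating it is undefined behaviour in C. A minimal sketch of the clamping idea, using a hypothetical helper name safe_negate_duration rather than the actual mov_update_dts_shift code:

```c
#include <limits.h>

/* Negate duration without overflowing.
 * INT_MIN has no positive counterpart, so it is first moved to
 * INT_MIN + 1, whose negation is INT_MAX. */
static int safe_negate_duration(int duration)
{
    if (duration == INT_MIN)
        duration = INT_MIN + 1;
    return -duration;
}
```

Every other input is negated unchanged, so only the single pathological value is affected.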
Nedeljko Babic [Tue, 13 Oct 2015 14:14:51 +0000 (16:14 +0200)]
avcodec/mips/aaccoder_mips: Sync with the generic code
This patch fixes the build of the AAC encoder optimized for MIPS, which was
broken by changes in the generic code that were not propagated to the
optimized code.
Also, some functions in the optimized code are basically duplicates of
functions from the generic code. Since they do not bring enough improvement
to justify their existence, they are removed (which improves the
maintainability of the optimized code).
Optimizations disabled in 97437bd are enabled again.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Ronald S. Bultje [Sat, 10 Oct 2015 02:35:49 +0000 (22:35 -0400)]
vp9: add itxfm_add eob shortcuts to 10/12bpp functions.
These aren't quite as helpful as the ones in 8bpp, since over there,
we can use pmulhrsw, but here the coefficients have too many bits to
be able to take advantage of pmulhrsw. However, we can still skip
cols for which all coefs are 0, and instead just zero the input data
for the row itx. This helps a few % on overall decoding speed.
Ronald S. Bultje [Mon, 12 Oct 2015 14:16:56 +0000 (10:16 -0400)]
vp9: initial attempt at a idct_idct_4x4 12bpp x86 simd (sse2) impl.
The trouble with this function is that intermediates overflow 31+sign
bits, so I've added some helpers (that will also be used in 10/12bpp
8x8, 16x16 and 32x32) to make that easier, basically emulating a half-
assed pmaddqd using 2xpmaddwd. It's currently sse2-only, if anyone sees
potential in adding ssse3, I'd love to hear it.
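The idea of building a wide multiply out of two narrower ones, as mentioned above, can be illustrated in scalar C. This is only a sketch under stated assumptions and not the actual asm helper: the function name mul_round14_split is hypothetical, the coefficient c is assumed to be scaled by 2^14 with |c| <= 16384, x is assumed to fit in about 30 bits so that the high part fits a 16-bit word, and right shifts of negative values are assumed to be arithmetic (as on GCC and Clang).

```c
#include <stdint.h>

/* Compute (x * c + 8192) >> 14 without forming the full product x * c,
 * which may not fit in 32 bits.
 *
 * x is split as x = hi * 2^14 + lo with 0 <= lo < 2^14, so
 *   x * c + 8192 = (hi * c) * 2^14 + (lo * c + 8192)
 * and the shift by 14 only has to be applied to the low partial product.
 * Each partial product is a 16x16 -> 32 bit multiply, which is the kind
 * of operation pmaddwd provides. */
static int32_t mul_round14_split(int32_t x, int16_t c)
{
    int32_t lo = x & 0x3fff;  /* low 14 bits, always non-negative */
    int32_t hi = x >> 14;     /* remaining bits, carries the sign */

    return hi * c + ((lo * c + 8192) >> 14);
}
```

For example, x = 1000000 and c = 11585 give a product of roughly 1.16e10, which exceeds 32 bits, while both partial products and the final result stay well inside the 32-bit range.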
On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
C: 78902 decicycles in idct, 262071 runs, 73 skips
avx: 32478 decicycles in idct, 262045 runs, 99 skips
Difference between the 2:
stddev: 0.39 PSNR:104.47 MAXDIFF: 2
This is unavoidable and due to the scale factors used in the x86
version, which cannot match the C ones.
In addition, the trick of adding an initial bias to the input of a
pass can overflow, as the input coefficients are already 15bits,
which is the maximum this function can handle.
Overall, however, the omse on 12 bits samples goes from 0.16916 to
0.16883. Reducing rowshift by 1 improves to 0.0908, but causes
overflows.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Modeled from the prores version. Clips to [0;1023] and is bitexact.
Bitexactness requires adding offsets in different places compared to
prores or C, and makes the function approximately 2% slower.
For 16 frames of a DNxHD 4:2:2 10bits test sequence:
C: 60861 decicycles in idct, 1048205 runs, 371 skips
sse2: 27567 decicycles in idct, 1048216 runs, 360 skips
avx: 26272 decicycles in idct, 1048171 runs, 405 skips
The add version is not implemented, so the corresponding dsp
function is set to NULL to make any code that tries to call it fail visibly.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When the input of a pass has 15 or 16 bits of precision (in particular
the column pass), the addition of a bias to W4 may lead to overflows
in the input to pmaddwd.
This requires postponing the addition of the bias until after the first
butterfly. To do so, the patch exploits the fact that m15 is zeroed but
otherwise unused. When the pass is safe, an address can be used directly,
and the number of xmm regs can be decreased. Otherwise, the 32-bit bias
is loaded into m15.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>