git.sesse.net Git - x264/log
Sebastian Dröge [Sun, 20 Dec 2015 20:49:35 +0000 (23:49 +0300)]
Fix AVC-Intra padding for non-Annex B encoding
Anton Mitrofanov [Mon, 11 Jan 2016 18:39:22 +0000 (21:39 +0300)]
ppc: Only perform AltiVec detection if compiled with AltiVec enabled
Anton Mitrofanov [Tue, 13 Oct 2015 12:30:16 +0000 (15:30 +0300)]
2-pass: Take into account possible frame reordering
Anton Mitrofanov [Tue, 13 Oct 2015 09:54:05 +0000 (12:54 +0300)]
Revise the 2-pass algorithm
Anton Mitrofanov [Mon, 4 Jan 2016 23:41:43 +0000 (02:41 +0300)]
Revise the row VBV algorithm (part 2)
Should fix rare cases of VBV emergency mode activation caused by placing too much
trust in the row predictors.
Henrik Gramner [Fri, 1 Jan 2016 11:44:31 +0000 (12:44 +0100)]
Bump dates to 2016
Henrik Gramner [Mon, 26 Oct 2015 18:54:20 +0000 (19:54 +0100)]
cli: Use memory-mapped input frames for yuv and y4m
Improves performance by avoiding extraneous memory copying.
Most beneficial on fast settings.
On average around 5-10% faster overall on ultrafast but the
performance improvement can be even larger in some cases.
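As a rough illustration of the idea (not the actual x264 CLI code), a raw frame can be accessed through a read-only mapping instead of being copied into a buffer with fread(). The sketch assumes a POSIX system. `open_raw_input`, `map_frame` and `unmap_frame` are hypothetical helper names. The file offset is rounded down to a page boundary because mmap() requires a page-aligned offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a raw input file read-only. */
static int open_raw_input( const char *path )
{
    return open( path, O_RDONLY );
}

/* Hypothetical helper: map `frame_size` bytes starting at byte `offset`
 * of `fd`. Returns a pointer to the first byte of the frame, or NULL on
 * failure. `*base` and `*len` receive the values to pass to munmap(). */
static const uint8_t *map_frame( int fd, off_t offset, size_t frame_size,
                                 void **base, size_t *len )
{
    assert( base && len );
    long page = sysconf( _SC_PAGESIZE );
    if( page <= 0 )
        return NULL;
    /* mmap() offsets must be page-aligned, so map from the page start. */
    off_t  aligned = offset - offset % page;
    size_t padding = (size_t)(offset - aligned);

    void *p = mmap( NULL, frame_size + padding, PROT_READ, MAP_PRIVATE, fd, aligned );
    if( p == MAP_FAILED )
        return NULL;
    *base = p;
    *len  = frame_size + padding;
    return (const uint8_t *)p + padding;
}

/* Release a mapping obtained from map_frame(). */
static void unmap_frame( void *base, size_t len )
{
    munmap( base, len );
}
```

The encoder can then read pixels directly from the returned pointer, so the only copy left is the one the kernel makes when paging the file in.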
Henrik Gramner [Thu, 7 Jan 2016 00:59:24 +0000 (01:59 +0100)]
y4m: Support extended frame headers when seeking
Use the actual length of the frame header of the first frame instead of
assuming a header without extensions when calculating the frame size.
Also makes the frame counter more accurate with extended frame headers.
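A minimal sketch of the calculation described above, not the code from the patch: a y4m frame header is the magic "FRAME", optional space-separated parameters, and a terminating newline, so its real length can be measured rather than assumed to be 6 bytes. The function names and the assumption that every frame uses the same header length as the first one are illustrative only.

```c
#include <stddef.h>
#include <string.h>

#define Y4M_FRAME_MAGIC "FRAME"

/* Length of the frame header at `buf`, including the trailing '\n'.
 * Returns 0 if the buffer does not start with a complete frame header. */
static size_t y4m_frame_header_len( const char *buf, size_t size )
{
    size_t magic_len = strlen( Y4M_FRAME_MAGIC );
    if( size < magic_len + 1 || memcmp( buf, Y4M_FRAME_MAGIC, magic_len ) )
        return 0;
    const char *end = memchr( buf + magic_len, '\n', size - magic_len );
    return end ? (size_t)(end - buf) + 1 : 0;
}

/* Number of whole frames in a file, assuming every frame header has the
 * same length as the first one (an assumption of this sketch). */
static size_t y4m_frame_count( size_t file_size, size_t stream_header_len,
                               size_t frame_header_len, size_t pixel_bytes )
{
    if( file_size < stream_header_len )
        return 0;
    return (file_size - stream_header_len) / (frame_header_len + pixel_bytes);
}
```

Seeking to frame n then uses the same stored frame size: stream_header_len + n * (frame_header_len + pixel_bytes).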
Henrik Gramner [Tue, 3 Nov 2015 16:55:08 +0000 (17:55 +0100)]
configure: Simplify cygwin/mingw/msys code
Avoids some code duplication.
Also drop the -mno-cygwin check since that option was removed back in 2008.
Henrik Gramner [Mon, 26 Oct 2015 17:52:46 +0000 (18:52 +0100)]
y4m: Avoid some redundant strlen() calls
Henrik Gramner [Sun, 25 Oct 2015 16:15:10 +0000 (17:15 +0100)]
Simplify threadpool_wait
Henrik Gramner [Fri, 16 Oct 2015 17:05:34 +0000 (19:05 +0200)]
windows: Use native threads by default
--disable-win32thread can be passed as an argument to configure to compile
with pthreads, which was the old default behavior.
Henrik Gramner [Sun, 11 Oct 2015 20:32:11 +0000 (22:32 +0200)]
x86: Avoid some bypass delays and false dependencies
A bypass delay of 1-3 clock cycles may occur on some CPUs when transitioning
between int and float domains, so try to avoid that if possible.
Henrik Gramner [Sun, 11 Oct 2015 20:32:03 +0000 (22:32 +0200)]
x86: Enable high bit-depth x264_coeff_last64_avx2_lzcnt
The function existed but was never enabled.
Geza Lore [Mon, 12 Oct 2015 12:13:42 +0000 (13:13 +0100)]
x86inc: Add debug symbols indicating sizes of compiled functions
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they can get confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
Henrik Gramner [Fri, 16 Oct 2015 19:28:49 +0000 (21:28 +0200)]
x86inc: Avoid creating unnecessary local labels
The REP_RET workaround is only needed on old AMD cpus, and the labels clutter
up the symbol table and confuse debugging/profiling tools, so use EQU to
create SHN_ABS symbols instead of creating local labels. Furthermore, skip
the workaround completely in functions that definitely won't run on such cpus.
This patch doesn't modify any emitted instructions, and doesn't actually affect
x264 at all. It's only for other projects that use x86inc.asm without an
appropriate `strip` command in their buildsystem.
Note that EQU is just creating a local label when using nasm instead of yasm.
This is probably a bug, but at least it doesn't break anything.
Henrik Gramner [Thu, 15 Oct 2015 15:42:49 +0000 (17:42 +0200)]
x86inc: Simplify AUTO_REP_RET
cpuflags is never undefined any more, it's set to 0 instead.
Also fix an incorrect comment.
Henrik Gramner [Mon, 12 Oct 2015 19:55:11 +0000 (21:55 +0200)]
x86inc: Use more consistent indentation
Henrik Gramner [Mon, 12 Oct 2015 18:15:18 +0000 (20:15 +0200)]
x86inc: Preserve arguments when allocating stack space
When allocating stack space with a larger alignment than the known stack
alignment a temporary register is used for storing the stack pointer.
Ensure that this isn't one of the registers used for passing arguments.
Henrik Gramner [Sat, 16 Jan 2016 23:25:47 +0000 (00:25 +0100)]
x86inc: Improve FMA instruction handling
* Correctly handle FMA instructions with memory operands.
* Print a warning if FMA instructions are used without the correct cpuflag.
* Simplify the instantiation code.
* Clarify documentation.
Only the last operand in FMA3 instructions can be a memory operand. When
converting FMA4 instructions to FMA3 instructions we can utilize the fact
that multiply is a commutative operation and reorder operands if necessary
to ensure that a memory operand is used only as the last operand.
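The operand reordering can be modelled in C as follows. This is only a sketch of the selection logic, not the x86inc macro code (which is written in the assembler's macro language). `operand_t` and `fma4_to_fma3` are hypothetical names, and the sketch assumes the destination aliases one of the three sources, as it must for an FMA3 encoding.

```c
#include <stdio.h>
#include <string.h>

typedef struct { const char *name; int is_mem; } operand_t;

/* FMA4 semantics: dst = src1 * src2 + src3.
 * FMA3 forms (a is both destination and a source, only c may be memory):
 *   132: a = a * c + b
 *   213: a = b * a + c
 *   231: a = b * c + a
 * Returns 0 and writes the FMA3 instruction to `out`, or -1 if the
 * operands cannot be encoded as a single FMA3 instruction. */
static int fma4_to_fma3( char *out, size_t out_size, const char *dst,
                         operand_t src1, operand_t src2, operand_t src3 )
{
    const char *form;
    operand_t b, c;

    /* Multiplication is commutative, so make the aliased multiplicand src1. */
    if( strcmp( dst, src1.name ) && !strcmp( dst, src2.name ) )
    {
        operand_t tmp = src1;
        src1 = src2;
        src2 = tmp;
    }

    if( !strcmp( dst, src1.name ) )
    {
        /* dst is a multiplicand: pick the form that puts memory last. */
        if( src2.is_mem ) { form = "132"; b = src3; c = src2; }
        else              { form = "213"; b = src2; c = src3; }
    }
    else if( !strcmp( dst, src3.name ) )
    {
        /* dst is the addend: swap the multiplicands if src1 is memory. */
        form = "231";
        if( src1.is_mem ) { b = src2; c = src1; }
        else              { b = src1; c = src2; }
    }
    else
        return -1; /* dst does not alias any source */

    if( b.is_mem )
        return -1; /* more than one memory operand */

    snprintf( out, out_size, "vfmadd%sps %s, %s, %s", form, dst, b.name, c.name );
    return 0;
}
```

For example, with `m0`, `m1` as registers and `[r0]` as memory, `dst = m0 * [r0] + m1` comes out as `vfmadd132ps m0, m1, [r0]`.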
Henrik Gramner [Sun, 11 Oct 2015 20:31:53 +0000 (22:31 +0200)]
x86inc: Be more verbose in assertion failures
Henrik Gramner [Wed, 30 Sep 2015 21:17:00 +0000 (23:17 +0200)]
x86inc: Make cpuflag() and notcpuflag() return 0 or 1
Makes it possible to use them in arithmetic expressions.
Henrik Gramner [Fri, 30 Oct 2015 15:55:49 +0000 (16:55 +0100)]
encoder_open: Fix memory leak
Furthermore, the x264_analyse_prepare_costs() and x264_analyse_init_costs()
functions were only used in x264_encoder_open(), so move that entire section
of code to analyse.c as well to simplify things.
Janne Grunau [Wed, 18 Nov 2015 10:08:22 +0000 (11:08 +0100)]
arm: do not fill mc_weight*_neon tabs for HIGH_BIT_DEPTH
The asm is only for 8-bit and function prototypes reflect that. Avoids
numerous warnings with --bit-depth=9/10.
Janne Grunau [Tue, 13 Oct 2015 21:50:11 +0000 (23:50 +0200)]
arm: Eliminate text relocations in asm
Android 6 does not link shared libraries with text relocations.
Make the movrel macro position independent and add movrelx for indirect
loads of external symbols.
Move the function pointer table for the aligned memcpy variants to the
data.rel.ro section on Linux/Android.
Martin Storsjö [Thu, 15 Oct 2015 08:50:33 +0000 (11:50 +0300)]
arm: Don't assume alignment in mbtree_propagate_list_internal where it isn't provided
Janne Grunau [Tue, 13 Oct 2015 21:50:12 +0000 (23:50 +0200)]
arm: Fix checkasm register clobber check on iOS
r9 is a volatile register in the iOS ABI and will therefore not be
preserved by compiled functions like the luma motion compensation.
Add the symbol prefix to the puts() call and use blx since a switch
between arm and thumb mode might be required.
Anton Mitrofanov [Wed, 30 Sep 2015 22:02:16 +0000 (01:02 +0300)]
ppc: Add detection of AltiVec support for FreeBSD
Patch from FreeBSD ports.
Anton Mitrofanov [Mon, 28 Sep 2015 18:07:55 +0000 (21:07 +0300)]
Don't assume 16-byte stack alignment by default on x86-32
Some compilers, depending on the target OS, use 4-byte stack alignment by default.
Explicitly check for known-good compilers and specific options for stack alignment.
Anton Mitrofanov [Tue, 22 Sep 2015 18:33:07 +0000 (21:33 +0300)]
Fix a few static analyzer performance hints
Anton Mitrofanov [Tue, 22 Sep 2015 17:19:23 +0000 (20:19 +0300)]
Revise the row VBV algorithm
Anton Mitrofanov [Tue, 22 Sep 2015 16:26:25 +0000 (19:26 +0300)]
Fix high bit depth lookahead cost compensation algorithm
Now high bit depth VBV should act more like 8-bit depth one.
Anton Mitrofanov [Tue, 22 Sep 2015 16:05:52 +0000 (19:05 +0300)]
Correctly update the intra row predictor in B-frames
It was previously used but never updated from its initialization value.
Anton Mitrofanov [Tue, 22 Sep 2015 15:58:24 +0000 (18:58 +0300)]
Change the predictors update algorithm
Keep predictor offsets more stable. This should fix VBV misprediction in frames
with a large difference in complexity between the top and bottom parts.
Martin Storsjö [Thu, 3 Sep 2015 06:30:44 +0000 (09:30 +0300)]
arm: Implement x264_mbtree_propagate_{cost, list}_neon
The cost function could be simplified to avoid having to clobber
q4/q5, but this requires reordering instructions, which increases
the total runtime.
checkasm timing Cortex-A7 A8 A9
mbtree_propagate_cost_c 63702 155835 62829
mbtree_propagate_cost_neon 17199 10454 11106
mbtree_propagate_list_c 104203 108949 84532
mbtree_propagate_list_neon 82035 78348 60410
Martin Storsjö [Thu, 3 Sep 2015 06:30:43 +0000 (09:30 +0300)]
x86: Share the mbtree_propagate_list macro with aarch64
This avoids having to duplicate the same code for all architectures
that implement only the internal part of this function in assembler.
Martin Storsjö [Wed, 2 Sep 2015 19:39:51 +0000 (22:39 +0300)]
arm: Implement luma intra deblocking
checkasm timing Cortex-A7 A8 A9
deblock_luma_intra[0]_c 5988 4653 4316
deblock_luma_intra[0]_neon 3103 2170 2128
deblock_luma_intra[1]_c 7119 5905 5347
deblock_luma_intra[1]_neon 2068 1381 1412
This includes extra optimizations by Janne Grunau.
Timings from a separate build, on Exynos 5422:
Cortex-A7 A15
deblock_luma_intra[0]_c 6627 3300
deblock_luma_intra[0]_neon 3059 1128
deblock_luma_intra[1]_c 7314 4128
deblock_luma_intra[1]_neon 2038 720
Martin Storsjö [Mon, 31 Aug 2015 19:40:31 +0000 (22:40 +0300)]
arm: Implement some neon 8x16c intra predict functions
checkasm timing Cortex-A7 A8 A9
intra_predict_8x16c_dct_c 862 540 590
intra_predict_8x16c_dct_neon 608 511 657
intra_predict_8x16c_h_c 972 707 719
intra_predict_8x16c_h_neon 722 656 672
intra_predict_8x16c_p_c 10183 9819 8655
intra_predict_8x16c_p_neon 2622 1972 1983
Martin Storsjö [Thu, 27 Aug 2015 21:15:01 +0000 (00:15 +0300)]
arm: Implement x264_plane_copy_neon
checkasm timing Cortex-A7 A8 A9
plane_copy_c 13124 10925 9106
plane_copy_neon 7349 5103 8945
Martin Storsjö [Fri, 28 Aug 2015 06:40:24 +0000 (09:40 +0300)]
checkasm: arm: Check register clobbering
Cast the function pointer to a different type signature, to
be able to use uint64_t as return type (instead of intptr_t) for
those calls that require it.
Use two separate functions, depending on whether neon is available.
Martin Storsjö [Thu, 13 Aug 2015 21:00:57 +0000 (00:00 +0300)]
checkasm: Try different widths for ssd_nv12
To test all codepaths in the aarch64 neon implementation, one at
the very least needs to test with width 8, 16, 24 and 32.
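The reasoning can be sketched with a stand-in for the optimized routine: an implementation that handles 16-pixel blocks and then an 8-pixel tail only exercises both loops, alone and in combination, if widths 8, 16, 24 and 32 are all tried. The code below is not the checkasm test. `ssd_nv12_blocked` is a hypothetical C stand-in for the neon code, and the simplified signatures are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Accumulate squared differences of n interleaved U/V pairs from x0. */
static void ssd_acc( const uint8_t *a, const uint8_t *b, int x0, int n,
                     uint64_t *ssd_u, uint64_t *ssd_v )
{
    for( int x = x0; x < x0 + n; x++ )
    {
        int du = a[2*x]   - b[2*x];
        int dv = a[2*x+1] - b[2*x+1];
        *ssd_u += (uint64_t)(du * du);
        *ssd_v += (uint64_t)(dv * dv);
    }
}

/* Straightforward reference implementation. */
static void ssd_nv12_ref( const uint8_t *a, const uint8_t *b, ptrdiff_t stride,
                          int width, int height, uint64_t *ssd_u, uint64_t *ssd_v )
{
    *ssd_u = *ssd_v = 0;
    for( int y = 0; y < height; y++ )
        ssd_acc( a + y * stride, b + y * stride, 0, width, ssd_u, ssd_v );
}

/* Stand-in for an optimized version: 16-wide main loop, 8-wide tail. */
static void ssd_nv12_blocked( const uint8_t *a, const uint8_t *b, ptrdiff_t stride,
                              int width, int height, uint64_t *ssd_u, uint64_t *ssd_v )
{
    *ssd_u = *ssd_v = 0;
    for( int y = 0; y < height; y++ )
    {
        const uint8_t *ra = a + y * stride, *rb = b + y * stride;
        int x = 0;
        for( ; x + 16 <= width; x += 16 )
            ssd_acc( ra, rb, x, 16, ssd_u, ssd_v );
        for( ; x + 8 <= width; x += 8 )
            ssd_acc( ra, rb, x, 8, ssd_u, ssd_v );
    }
}

/* Compare both versions for each width; returns the number of mismatches. */
static int count_width_mismatches( void )
{
    enum { MAX_W = 32, H = 4, STRIDE = 2 * MAX_W };
    static const int widths[] = { 8, 16, 24, 32 };
    uint8_t a[STRIDE * H], b[STRIDE * H];
    for( int i = 0; i < STRIDE * H; i++ )
    {
        a[i] = (uint8_t)(i * 37 + 11);
        b[i] = (uint8_t)(i * 101 + 3);
    }
    int bad = 0;
    for( size_t i = 0; i < sizeof(widths) / sizeof(widths[0]); i++ )
    {
        uint64_t u0, v0, u1, v1;
        ssd_nv12_ref( a, b, STRIDE, widths[i], H, &u0, &v0 );
        ssd_nv12_blocked( a, b, STRIDE, widths[i], H, &u1, &v1 );
        bad += u0 != u1 || v0 != v1;
    }
    return bad;
}
```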
Jerome Duval [Fri, 13 Jun 2014 19:56:27 +0000 (19:56 +0000)]
Haiku support
Add Haiku as supported platform in configure.
Haiku has no nice() function, use the platform specific substitute instead.
Martin Storsjö [Tue, 25 Aug 2015 11:38:20 +0000 (14:38 +0300)]
checkasm: aarch64: Check register clobbering
Disable this on iOS, since it has got a slightly different ABI
for vararg parameters.
Martin Storsjö [Tue, 25 Aug 2015 20:36:45 +0000 (23:36 +0300)]
arm: Implement x264_decimate_score15/16/64_neon
checkasm timing Cortex-A7 A8 A9
decimate_score15_c 764 736 535
decimate_score15_neon 487 494 453
decimate_score16_c 782 727 553
decimate_score16_neon 487 494 521
decimate_score64_c 2361 2597 2011
decimate_score64_neon 1017 802 785
Martin Storsjö [Tue, 25 Aug 2015 20:36:44 +0000 (23:36 +0300)]
arm: Implement chroma intra deblock
checkasm timing Cortex-A7 A8 A9
deblock_chroma_420_intra_mbaff_c 1469 1276 1181
deblock_chroma_420_intra_mbaff_neon 981 717 644
deblock_chroma_intra[1]_c 2954 2402 2321
deblock_chroma_intra[1]_neon 947 581 575
deblock_h_chroma_420_intra_c 2859 2509 2264
deblock_h_chroma_420_intra_neon 1480 1119 1028
deblock_h_chroma_422_intra_c 6211 5030 4792
deblock_h_chroma_422_intra_neon 2894 1990 2077
Martin Storsjö [Tue, 25 Aug 2015 11:38:17 +0000 (14:38 +0300)]
arm: Implement x264_pixel_sa8d_satd_16x16_neon
This requires spilling some registers to the stack,
contrary to the aarch64 version.
checkasm timing Cortex-A7 A8 A9
sa8d_satd_16x16_neon 12936 6365 7492
sa8d_satd_16x16_separate_neon 14841 6605 8324
Martin Storsjö [Tue, 25 Aug 2015 11:38:16 +0000 (14:38 +0300)]
arm: Implement x264_deblock_h_chroma_mbaff_neon
checkasm timing Cortex-A7 A8 A9
deblock_chroma_420_mbaff_c 1944 1706 1526
deblock_chroma_420_mbaff_neon 1210 873 865
Martin Storsjö [Tue, 25 Aug 2015 11:38:15 +0000 (14:38 +0300)]
arm: Implement x264_deblock_h_chroma_422_neon
checkasm timing Cortex-A7 A8 A9
deblock_h_chroma_422_c 6953 6269 5145
deblock_h_chroma_422_neon 3905 2569 2551
Martin Storsjö [Tue, 25 Aug 2015 11:38:14 +0000 (14:38 +0300)]
arm: Implement integral_init4/8h/v_neon
checkasm timing Cortex-A7 A8 A9
integral_init4h_c 10466 8590 6161
integral_init4h_neon 3021 1494 1800
integral_init4v_c 16250 13590 13628
integral_init4v_neon 3473 2073 3291
integral_init8h_c 10100 8275 5705
integral_init8h_neon 4403 2344 2751
integral_init8v_c 6403 4632 4999
integral_init8v_neon 1184 783 1306
Martin Storsjö [Tue, 25 Aug 2015 11:38:13 +0000 (14:38 +0300)]
arm: Implement x264_denoise_dct_neon
checkasm timing Cortex-A7 A8 A9
denoise_dct_c 6604 5510 5858
denoise_dct_neon 1774 1139 1614
Martin Storsjö [Tue, 25 Aug 2015 11:38:12 +0000 (14:38 +0300)]
arm: Add x264_nal_escape_neon
checkasm timing Cortex-A7 A8 A9
nal_escape_c 852758 879566 655497
nal_escape_neon 376831 450678 371673
Martin Storsjö [Tue, 25 Aug 2015 11:38:11 +0000 (14:38 +0300)]
arm: Add neon versions of vsad, asd8 and ssd_nv12_core
These are straight translations of the aarch64 versions.
checkasm timing Cortex-A7 A8 A9
vsad_c 16234 10984 9850
vsad_neon 2132 1020 789
asd8_c 5859 3561 3543
asd8_neon 1407 1279 1250
ssd_nv12_c 608096 591072 426285
ssd_nv12_neon 72752 33549 41347
Martin Storsjö [Tue, 25 Aug 2015 11:38:10 +0000 (14:38 +0300)]
checkasm: Check the right output range for integral_initXh
These functions write their output into sum+stride, while we previously
only checked [0..stride-8] within the sum array.
This catches the previously broken aarch64 version of these functions.
Also check up until stride-4 elements for init4h.
Janne Grunau [Thu, 20 Aug 2015 11:55:54 +0000 (13:55 +0200)]
aarch64: Skip deblocking in x264_deblock_h_chroma_422_neon
If the parameters (alpha, beta, tc0[]) indicated that the deblocking
should be skipped, every second chroma line would have been deblocked
anyway.
deblock_h_chroma_422_neon: 2259 (before)
deblock_h_chroma_422_neon: 2192 (after)
Janne Grunau [Mon, 17 Aug 2015 14:39:20 +0000 (16:39 +0200)]
aarch64: Optimize various intra_predict asm functions
Make them at least as fast as the compiled C version (tested on
cortex-a53 vs. gcc 4.9.2).
                         C   NEON (before)   NEON (after)
intra_predict_4x4_dc:   260       335             260
intra_predict_4x4_dct:  210       265             200
intra_predict_8x8c_dc:  497       548             493
intra_predict_8x8c_v:   232       309             179 (arm64)
intra_predict_8x16c_dc: 795       830             790
Janne Grunau [Tue, 18 Aug 2015 08:25:10 +0000 (10:25 +0200)]
aarch64: Faster intra_predict_4x4_h
Use multiplication with 0x01010101 for splats.
On a cortex-a53:
                      gcc 4.9.2   llvm 3.6   neon (before)   neon (after)
intra_predict_4x4_h:     162         147        160/155         139/135
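The multiplication trick is easy to show in C: multiplying a byte by 0x01010101 replicates it into all four bytes of a 32-bit word, which is exactly one row of a 4x4 horizontal prediction. This is a sketch of the technique rather than the aarch64 asm, and `predict_4x4_h_sketch` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all four bytes of a 32-bit word. */
static inline uint32_t splat_u8_to_u32( uint8_t v )
{
    return v * 0x01010101u;
}

/* Fill each row of a 4x4 block with the pixel to its left. */
static void predict_4x4_h_sketch( uint8_t *dst, ptrdiff_t stride, const uint8_t left[4] )
{
    for( int y = 0; y < 4; y++ )
    {
        uint32_t row = splat_u8_to_u32( left[y] );
        memcpy( dst + y * stride, &row, sizeof(row) ); /* one 32-bit store per row */
    }
}
```

Because all four bytes of the word are identical, the result does not depend on endianness.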
Janne Grunau [Tue, 18 Aug 2015 08:25:09 +0000 (10:25 +0200)]
aarch64: Fix coeff_level_run* macros with LLVM's assembler
LLVM's integrated assembler does not treat symbols as integer constants.
Janne Grunau [Tue, 18 Aug 2015 08:25:08 +0000 (10:25 +0200)]
aarch64: Remove commas LLVM's assembler complains about
Martin Storsjö [Thu, 13 Aug 2015 20:59:31 +0000 (23:59 +0300)]
arm: Implement x264_sub8x16_dct_dc_neon
checkasm timing Cortex-A7 A8 A9
sub8x16_dct_dc_c 6386 3901 4080
sub8x16_dct_dc_neon 1491 698 917
Martin Storsjö [Thu, 13 Aug 2015 20:59:28 +0000 (23:59 +0300)]
arm: Optimize x264_deblock_h_chroma_neon
Shuffle both chroma components together as a 16 bit unit, and
don't write the unchanged columns (like in x264_deblock_h_luma_neon
and in the aarch64 version of the function).
This causes a minor slowdown for x264_deblock_v_chroma_neon, but
it is negligible compared to the speedup.
checkasm timing Cortex-A7 A8 A9
deblock_chroma[1]_c 4817 4057 3601
deblock_chroma[1]_neon 1249 716 817 (before)
deblock_chroma[1]_neon 1249 766 845 (after)
deblock_h_chroma_420_c 3699 3275 2830
deblock_h_chroma_420_neon 2068 1414 1400 (before)
deblock_h_chroma_420_neon 1838 1355 1291 (after)
Martin Storsjö [Thu, 13 Aug 2015 20:59:27 +0000 (23:59 +0300)]
aarch64: Remove leftover commented out code
Martin Storsjö [Thu, 13 Aug 2015 20:59:26 +0000 (23:59 +0300)]
aarch64: Simplify the decimate_score functions
After doing a left shift by the number of bits returned by clz,
only bits set to zero can be shifted out, so if the register
was nonzero to start with (which is checked), it can't become
zero here.
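The argument can be checked with a few lines of C. The sketch assumes a GCC/Clang-style `__builtin_clz` and a 32-bit `unsigned int`, and `normalize_by_clz` is a hypothetical name: shifting left by the leading-zero count moves the highest set bit to bit 31, so a nonzero input always yields a nonzero result and a second zero check is redundant.

```c
#include <assert.h>
#include <stdint.h>

/* Shift x left until its highest set bit reaches bit 31.
 * __builtin_clz() is undefined for 0, so the caller must check first,
 * just as the asm checks the register before this point. */
static inline uint32_t normalize_by_clz( uint32_t x )
{
    assert( x != 0 );
    return x << __builtin_clz( x );
}
```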
Martin Storsjö [Thu, 13 Aug 2015 20:59:25 +0000 (23:59 +0300)]
arm: Use aligned loads in x264_coeff_last15_neon
After subtracting 2, the pointer will be aligned.
checkasm timing Cortex-A7 A8 A9
coeff_last15_c 423 375 230
coeff_last15_neon 350 420 404 (before)
coeff_last15_neon 350 400 394 (after)
Martin Storsjö [Thu, 13 Aug 2015 20:59:24 +0000 (23:59 +0300)]
arm: Simplify x264_predict_8x8c_p_neon
This gets rid of a few unnecessary (and confusing) steps in
calculating the increment to i00.
checkasm timing Cortex-A7 A8 A9
intra_predict_8x8c_p_c 5525 4732 4755
intra_predict_8x8c_p_neon 1719 1140 1262 (before)
intra_predict_8x8c_p_neon 1663 1142 1255 (after)
Vittorio Giovara [Tue, 15 Sep 2015 13:40:14 +0000 (15:40 +0200)]
lavf: Use the prefixed name for pixel format enum
Janne Grunau [Wed, 2 Sep 2015 22:21:58 +0000 (00:21 +0200)]
aarch64: fix x264_mbtree_propagate_cost_neon
The branch condition caused the loop to execute one more time than
intended. Detected through memory corruption on arm with the 1:1 port of
the function.
Martin Storsjö [Thu, 13 Aug 2015 20:59:22 +0000 (23:59 +0300)]
aarch64: Fix integral_init4/8h_neon
The stride is the number of uint16_t elements and thus needs
to be shifted.
This issue had slipped unnoticed since checkasm didn't actually
verify the output of these functions.
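In C the compiler scales pointer arithmetic by the element size, but asm works on byte addresses, which is where the missing shift comes from. A small sketch (helper names are hypothetical) of the conversion the asm has to do by hand:

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a stride counted in uint16_t elements to bytes (stride << 1). */
static inline ptrdiff_t stride_to_bytes_u16( ptrdiff_t stride_elems )
{
    return stride_elems * (ptrdiff_t)sizeof(uint16_t);
}

/* Advance one row using byte arithmetic, as assembly code must. */
static inline uint16_t *next_row_bytes( uint16_t *sum, ptrdiff_t stride_elems )
{
    return (uint16_t *)( (uint8_t *)sum + stride_to_bytes_u16( stride_elems ) );
}
```

Using the unshifted stride as a byte offset would land halfway into the current row instead of at the start of the next one.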
Henrik Gramner [Thu, 27 Aug 2015 17:53:00 +0000 (19:53 +0200)]
x86: Fix integral_init4/8h_avx2
The AVX2 implementation was using the wrong offsets. It went undetected due to
the checkasm test being incorrect.
Mark Webster [Wed, 5 Aug 2015 03:28:17 +0000 (04:28 +0100)]
Simplify inclusion of x264.h in C++ projects
Name all structs to support forward declarations.
Add a conditional extern "C" wrapper in x264.h itself instead of having to
specify it in every location where it's included.
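The pattern looks roughly like the following. This is a hypothetical `demo_api.h`, not an excerpt from x264.h, and all `demo_*` names are placeholders. The struct has a tag as well as a typedef name, so other headers can forward-declare it, and the linkage guard lives in the header itself.

```c
/* demo_api.h (hypothetical) */
#ifndef DEMO_API_H
#define DEMO_API_H

#ifdef __cplusplus
extern "C" {
#endif

/* A forward declaration like this only works because the struct is named. */
struct demo_param_t;
int demo_area( const struct demo_param_t *param );

typedef struct demo_param_t
{
    int i_width;
    int i_height;
} demo_param_t;

void demo_param_default( demo_param_t *param );

#ifdef __cplusplus
}
#endif

#endif /* DEMO_API_H */

/* demo_api.c (hypothetical implementation) */
void demo_param_default( demo_param_t *param )
{
    param->i_width  = 0;
    param->i_height = 0;
}

int demo_area( const struct demo_param_t *param )
{
    return param->i_width * param->i_height;
}
```

A C++ translation unit can then simply `#include` the header without wrapping the include in its own `extern "C"` block.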
Henrik Gramner [Sun, 16 Aug 2015 19:59:26 +0000 (21:59 +0200)]
checkasm: Properly save rdx/edx in checkasm_call() on x86
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.
Doesn't affect any of the existing checkasm tests but it's more correct
behavior and it might be useful in the future.
Henrik Gramner [Tue, 11 Aug 2015 15:19:35 +0000 (17:19 +0200)]
x86: Enable SSE2 by default on x86-32
It makes more sense to tune the defaults to benefit the vast majority of users.
Anyone still using a Pentium III for video encoding is of course free to
explicitly set different flags when compiling.
Henrik Gramner [Mon, 10 Aug 2015 20:30:21 +0000 (22:30 +0200)]
msvs/icl: Improve default CFLAGS
Use -fp:fast as a substitute for -ffast-math.
Increase warning level from -W0 to -W1 (the default setting).
Disable -GS (stack cookies) on MSVS. It's disabled by default on ICL.
Henrik Gramner [Wed, 12 Aug 2015 20:23:31 +0000 (22:23 +0200)]
Use a relative $SRCPATH for out-of-tree builds
Fixes out-of-tree MSVS builds on Cygwin.
Henrik Gramner [Sat, 8 Aug 2015 20:26:38 +0000 (22:26 +0200)]
cygwin: Enable MSVS support
`cl -showIncludes` creates absolute Windows paths for some files, attempt
to convert those to Unix paths.
Use relative paths for dependencies located in or below the working directory
in order to mimic the behavior of gcc and to make the paths more readable.
Make the dependency generation script a bit more robust in general.
Henrik Gramner [Sat, 8 Aug 2015 16:34:21 +0000 (18:34 +0200)]
cltostr.sh: Minor fixes
Henrik Gramner [Sat, 8 Aug 2015 10:21:54 +0000 (12:21 +0200)]
Simplify version.sh
Also remove some non-POSIX syntax and improve robustness.
As a bonus the script now runs about 2-3 times faster.
`git rev-list --count` could be used to simplify things even further,
but that functionality was added in git 1.7.2 so keep `wc -l` for now
to maintain compatibility with older git versions.
장영훈 [Fri, 7 Aug 2015 05:43:24 +0000 (14:43 +0900)]
msvs: Fix cl detection in non-English environments
Henrik Gramner [Mon, 3 Aug 2015 19:05:11 +0000 (21:05 +0200)]
x86inc: Sync minor changes from ffmpeg/libav
Henrik Gramner [Wed, 29 Jul 2015 17:30:52 +0000 (19:30 +0200)]
matroska: Add comments for the remaining element names
Henrik Gramner [Wed, 29 Jul 2015 17:30:41 +0000 (19:30 +0200)]
Silence various static analyzer warnings
Those are false positives, but it doesn't hurt to get rid of them.
Henrik Gramner [Sun, 26 Jul 2015 21:13:29 +0000 (23:13 +0200)]
mingw: Enable the tsaware linker flag
Avoids an irrelevant compatibility layer in Terminal Services environments.
https://msdn.microsoft.com/en-us/library/cc834995.aspx
Henrik Gramner [Sun, 26 Jul 2015 21:13:26 +0000 (23:13 +0200)]
msvs: Don't redefine snprintf for VS2015
Visual Studio 2015 has a proper snprintf implementation.
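A version-guarded fallback of this kind might look as follows. It is a sketch rather than a quote from the patch. `_MSC_VER` 1900 corresponds to Visual Studio 2015, and `format_resolution` is a hypothetical helper added only to exercise the call.

```c
#include <stdio.h>

/* Only older MSVC versions need the non-standard _snprintf substitute. */
#if defined(_MSC_VER) && _MSC_VER < 1900
#define snprintf _snprintf
#endif

/* Hypothetical helper: format "WIDTHxHEIGHT" into buf. */
static int format_resolution( char *buf, size_t size, int width, int height )
{
    return snprintf( buf, size, "%dx%d", width, height );
}
```

Note that `_snprintf` does not guarantee NUL termination on truncation, which is one reason to prefer the real `snprintf` whenever it is available.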
Henrik Gramner [Sun, 26 Jul 2015 21:13:19 +0000 (23:13 +0200)]
msvs: Prefer link.exe from the same directory as cl.exe
/usr/bin/link from coreutils may be located before the MSVS linker in $PATH
which causes linking to fail due to using the wrong binary.
Henrik Gramner [Sun, 26 Jul 2015 22:10:00 +0000 (00:10 +0200)]
frame_dump: check fseek() return value
Henrik Gramner [Sun, 26 Jul 2015 22:08:38 +0000 (00:08 +0200)]
x264_vfprintf: use va_copy
It's undefined behavior to use the same va_list twice.
This most likely didn't cause any issues in practice since the string would
have to be larger than 4 KiB to trigger the fallback path.
Use a workaround for ICL, as it doesn't define va_copy even for C99.
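The fix follows a common pattern, sketched here with hypothetical helpers (`vformat_dup`, `format_dup`) and a deliberately tiny first buffer instead of the 4 KiB one mentioned above: the first vsnprintf() call consumes `arg`, so the fallback pass must use a copy made beforehand with va_copy().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into a small stack buffer first and fall back to an exactly
 * sized heap buffer if the result does not fit. Returns a malloc()ed
 * string that the caller must free(), or NULL on error. */
static char *vformat_dup( const char *fmt, va_list arg )
{
    char small[16];
    va_list arg2;
    va_copy( arg2, arg ); /* untouched copy for the second pass */

    int len = vsnprintf( small, sizeof(small), fmt, arg );
    char *out = NULL;
    if( len >= 0 )
    {
        out = malloc( (size_t)len + 1 );
        if( out )
        {
            if( (size_t)len < sizeof(small) )
                memcpy( out, small, (size_t)len + 1 );
            else
                vsnprintf( out, (size_t)len + 1, fmt, arg2 ); /* fallback path */
        }
    }
    va_end( arg2 );
    return out;
}

/* Variadic convenience wrapper around vformat_dup(). */
static char *format_dup( const char *fmt, ... )
{
    va_list arg;
    va_start( arg, fmt );
    char *s = vformat_dup( fmt, arg );
    va_end( arg );
    return s;
}
```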
Henrik Gramner [Sun, 26 Jul 2015 22:08:31 +0000 (00:08 +0200)]
param_parse: Fix framerate rounding issues
Marcin Juszkiewicz [Mon, 1 Jun 2015 09:24:45 +0000 (11:24 +0200)]
aarch64: Remove broken CFLAGS in configure
GCC doesn't have an "-arch" switch, but works when that entire line is removed.
Rong Yan [Mon, 20 Jul 2015 08:34:20 +0000 (03:34 -0500)]
ppc: Add little-endian PowerPC support
Rishikesh More [Thu, 18 Jun 2015 12:18:46 +0000 (17:48 +0530)]
mips: MSA quant optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:45 +0000 (17:48 +0530)]
mips: MSA predict optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:44 +0000 (17:48 +0530)]
mips: MSA pixel optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:43 +0000 (17:48 +0530)]
mips: MSA deblock optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:42 +0000 (17:48 +0530)]
mips: MSA dct optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:40 +0000 (17:48 +0530)]
mips: MSA mc optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:38 +0000 (17:48 +0530)]
mips: Common MSA macros
Add macros for load/store, slide, shift, transpose and basic arithmetic
operations required by subsequent patches.
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Tue, 12 May 2015 14:08:09 +0000 (19:38 +0530)]
mips: Add MSA support to checkasm
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Kaustubh Raste [Fri, 17 Apr 2015 12:08:58 +0000 (17:38 +0530)]
mips: Initial MSA support
MSA is the MIPS SIMD Architecture.
Add X264_CPU_MSA define.
Update configure to detect MIPS platform and set flags.
CPU-specific gcc options are expected through --extra-cflags.
Sample command line for mips32r5:
./configure --host=mipsel-linux-gnu --cross-prefix=<TOOLCHAIN>/mips-mti-linux-gnu-
--extra-cflags="-EL -mips32r5 -msched-weight -mload-store-pairs"
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Anton Mitrofanov [Thu, 16 Jul 2015 21:22:29 +0000 (00:22 +0300)]
Limit autodetection of threads number according to the source height
Anton Mitrofanov [Thu, 16 Jul 2015 16:04:59 +0000 (19:04 +0300)]
Fine-tune the frame size predictors at ratecontrol start
This is an attempt to improve VBV at the start of a video when using many threads,
which delays feedback to the predictors.
Anton Mitrofanov [Thu, 16 Jul 2015 13:15:56 +0000 (16:15 +0300)]
Use forced frame types in slicetype analysis
This should improve MBTree and VBV when a lot of forced frame types are used.