git.sesse.net Git - x264/log
x264
Fiona Glaser [Wed, 14 Jan 2009 02:11:50 +0000 (21:11 -0500)]
Cache ref costs and use more accurate MV costs
New MV costs should improve quality slightly by improving the smoothness of the field of MV costs (and they're closer to CABAC's actual costs).
Despite being optimized for CABAC, they still help under CAVLC, albeit less.
MV cost change by Loren Merritt

Fiona Glaser [Wed, 14 Jan 2009 01:22:36 +0000 (20:22 -0500)]
Support forced frametypes with scenecut/b-adapt
This allows an input qpfile to be used to force I-frames, for example.
The same can be done through the library interface.
Document the format of the qpfile in --longhelp and the forcing of frametypes in x264.h
Note that forcing B-frames and B-refs may not always have the intended result.
Patch partially by Steven Walters <kemuri9@gmail.com>.
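
To illustrate the qpfile input, here is a minimal sketch that parses one line. It assumes the three-field layout `framenumber frametype QP` described in `--longhelp`, with a QP of -1 meaning "let the encoder choose". The struct and function names are hypothetical and are not part of x264.

```c
#include <stdio.h>

/* Hypothetical record for one qpfile line; not an x264 type. */
typedef struct {
    int  frame;  /* frame number */
    char type;   /* frame type letter, e.g. 'I', 'P', 'B' */
    int  qp;     /* forced QP, or -1 to let the encoder choose */
} example_qpfile_entry;

/* Parse one line; returns 1 on success, 0 if the line is malformed. */
int example_parse_qpfile_line(const char *line, example_qpfile_entry *out)
{
    return sscanf(line, "%d %c %d", &out->frame, &out->type, &out->qp) == 3;
}
```

A caller would read the file line by line and skip any line for which the function returns 0.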

Fiona Glaser [Wed, 14 Jan 2009 00:58:44 +0000 (19:58 -0500)]
Remove an IDIV from i8x8 analysis
Only one IDIV is left in macroblock level code (transform_rd)

Fiona Glaser [Thu, 8 Jan 2009 20:07:16 +0000 (15:07 -0500)]
Fix regression in r1066
With some combinations of video width and other settings, the scratch buffer was slightly too small.
This caused heap corruption on some systems.
Also prevent merange from being raised during encoding with esa/tesa through encoder_reconfig, as this no longer works.

Fiona Glaser [Tue, 6 Jan 2009 21:55:44 +0000 (16:55 -0500)]
Disable B-frames in lossless mode
They hurt compression anyways, and direct auto was bugged with lossless.

Brad Smith [Mon, 5 Jan 2009 22:53:11 +0000 (22:53 +0000)]
Factorize in ppccommon.h the conditional inclusion of altivec.h on Linux systems.

Brad Smith [Mon, 5 Jan 2009 20:58:32 +0000 (15:58 -0500)]
Disable __builtin_clz() intrinsic on gcc versions prior to 3.4.
The function did not exist before that version.
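
A version guard of this kind can be sketched as follows. The macro and function names are hypothetical (this is not the x264 source), and the fallback is a generic binary search over the bits, assuming a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* True when the compiler is gcc 3.4 or newer, where __builtin_clz() exists. */
#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))
#define EXAMPLE_HAVE_BUILTIN_CLZ 1
#else
#define EXAMPLE_HAVE_BUILTIN_CLZ 0
#endif

/* Count leading zero bits of a nonzero 32-bit value (hypothetical helper). */
static int example_clz32(uint32_t x)
{
#if EXAMPLE_HAVE_BUILTIN_CLZ
    return __builtin_clz(x);            /* result is undefined for x == 0 */
#else
    int n = 0;                          /* portable fallback: halve the search range */
    if (!(x & 0xFFFF0000u)) { n += 16; x <<= 16; }
    if (!(x & 0xFF000000u)) { n += 8;  x <<= 8;  }
    if (!(x & 0xF0000000u)) { n += 4;  x <<= 4;  }
    if (!(x & 0xC0000000u)) { n += 2;  x <<= 2;  }
    if (!(x & 0x80000000u)) { n += 1; }
    return n;
#endif
}
```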

Fiona Glaser [Fri, 2 Jan 2009 02:44:00 +0000 (21:44 -0500)]
Small tweaks to coeff asm
Factor out a few redundant pxors
Related cosmetics

Steven Walters [Wed, 31 Dec 2008 03:20:37 +0000 (22:20 -0500)]
Use the correct strtok under MSVC
Also change one malloc -> x264_malloc

Fiona Glaser [Wed, 31 Dec 2008 03:14:45 +0000 (22:14 -0500)]
Add stack alignment for lookahead functions
Should allow libx264 to be called from non-gcc-compiled applications without adding force_align_arg_pointer.

Fiona Glaser [Wed, 31 Dec 2008 01:47:45 +0000 (20:47 -0500)]
Add support for SSE4a (Phenom) LZCNT instruction
Significantly speeds up coeff_last and coeff_level_run on Phenom CPUs for faster CAVLC and CABAC.
Also a small tweak to coeff_level_run asm.

Steven Walters [Mon, 29 Dec 2008 05:14:26 +0000 (05:14 +0000)]
factor mallocs out of hpel, ssim, and esa.
there should now be no memory allocation outside of init-time.

Fiona Glaser [Tue, 30 Dec 2008 03:12:17 +0000 (03:12 +0000)]
Much faster CAVLC RDO and bitstream writing
Pure asm version of level/run coding.  Over 2x faster than C.
Up to 40% faster CAVLC RDO.  Overall benefit up to ~7.5% with RDO or ~5% with fast encoding settings.
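
For readers unfamiliar with level/run coding, the following plain-C sketch shows the general operation: walk a coefficient block from the last nonzero coefficient backwards, recording each nonzero level and the run of zeros that precedes it in scan order. It is a generic reference written for illustration, with hypothetical names, and is not the x264 C or asm implementation.

```c
#include <stdint.h>

/* Fill level[] and run[] in reverse scan order; returns the nonzero count.
 * run[i] is the number of zeros between level[i] and the next lower
 * nonzero coefficient (or the start of the block). */
int example_level_run(const int16_t *coef, int n, int16_t *level, uint8_t *run)
{
    int last = n - 1;
    int total = 0;

    /* Skip trailing zeros to find the last nonzero coefficient. */
    while (last >= 0 && coef[last] == 0)
        last--;

    while (last >= 0) {
        int zeros = 0;
        level[total] = coef[last--];
        /* Count the zeros below this coefficient. */
        while (last >= 0 && coef[last] == 0) {
            zeros++;
            last--;
        }
        run[total++] = (uint8_t)zeros;
    }
    return total;
}
```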

Loren Merritt [Tue, 30 Dec 2008 02:52:25 +0000 (21:52 -0500)]
Cosmetics: cleaner syntax for defining temporary registers in asm
Globally define t#[qdwb], so that only t# needs to be locally defined when reorganizing registers

Fiona Glaser [Sun, 28 Dec 2008 02:36:14 +0000 (21:36 -0500)]
Much faster CABAC RDO
Since RDO doesn't care in what order bit costs are calculated, merge sigmap and level coding into the same loop in RDO.
This is bit-exact for 4x4dct but slightly incorrect for 8x8dct due to the sigmap containing duplicated contexts.
However, the PSNR penalty of this is extremely small (~0.001 dB).
Speed benefit is about 15% in 4x4dct and 30% in 8x8dct residual bit cost calculation at QP20.
Overall encoding speed benefit is up to 5%, depending on encoding settings.
Also remove an old unnecessary CABAC table that hasn't been used for years.

Fiona Glaser [Fri, 26 Dec 2008 12:35:49 +0000 (07:35 -0500)]
VLC table optimizations
Slightly reorganize VLC tables for ~2% faster block_residual_write_cavlc.
Also a small optimization in p8x8 CAVLC.

Loren Merritt [Thu, 25 Dec 2008 03:58:17 +0000 (22:58 -0500)]
Fix crash in --me esa/tesa introduced in r1058
Also suppress the last mingw warning message

Fiona Glaser [Wed, 24 Dec 2008 03:33:28 +0000 (22:33 -0500)]
Optimize variance asm + minor changes
Remove SAD argument from var, not needed anymore.
Speed up var asm a bit by eliminating psadbw and instead HADDWing at end.
Eliminate all remaining warnings on gcc 3.4 on cygwin
Port another minor optimization from lavc (pskip)

Fiona Glaser [Tue, 23 Dec 2008 23:31:48 +0000 (18:31 -0500)]
Minor CABAC cleanups and related optimizations
Merge the two list tables to allow cleaner MC/CABAC/CAVLC code
Remove lots of unnecessary {s
Port some very minor opts from lavc

Loren Merritt [Thu, 11 Dec 2008 19:47:17 +0000 (19:47 +0000)]
faster ESA init
reduce memory if using ESA and not p4x4

Fiona Glaser [Tue, 16 Dec 2008 07:02:49 +0000 (23:02 -0800)]
More macroblock_cache optimizations
Patch partially by Loren Merritt

Fiona Glaser [Mon, 15 Dec 2008 21:15:29 +0000 (13:15 -0800)]
Faster macroblock_cache_rect
Explicit loop unrolling

Fiona Glaser [Mon, 15 Dec 2008 02:30:51 +0000 (18:30 -0800)]
Optimizations in predict_mv_direct
Add some early terminations and minor optimizations
This change may also fix the extremely rare direct+threading MV bug.

David Wolstencroft [Sun, 14 Dec 2008 10:47:28 +0000 (10:47 +0000)]
Fix visual corruption when picture width was not mod 32.
The previous Altivec implementation of mc_chroma assumed that i_src_stride was always mod 16.

Guillaume Poirier [Mon, 8 Dec 2008 20:11:45 +0000 (21:11 +0100)]
Add support for FSF GCC version >= 4.3 on OSX.
Previously, only the Apple GCC version was supported.

Fiona Glaser [Fri, 12 Dec 2008 01:31:52 +0000 (17:31 -0800)]
More accurate refcost for p8x8 CAVLC
Slightly better quality, especially in non-RD mode, with CAVLC.

Loren Merritt [Thu, 11 Dec 2008 04:54:17 +0000 (20:54 -0800)]
use lookup tables instead of actual exp/pow for AQ
Significant speed boost, especially on CPUs with atrociously slow floating point units (e.g. Pentium 4 saves 800 clocks per MB with this change).
Add x264_clz function as part of the LUT system: this may be useful later.
Note this changes output somewhat as the numbers from the lookup table are not exact.
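
The general idea of trading an exp/pow call for a table lookup can be sketched in fixed point as below. This is an illustration only: the table size, the Q2 input and Q8 output formats, and all names are assumptions made for the example, not the tables this commit adds. As the note above says, the result is approximate because the table entries are rounded.

```c
#include <stdint.h>

/* 2^(i/4) in Q8 fixed point (value * 256) for i = 0..3. Illustrative table. */
static const uint16_t example_exp2_q8_lut[4] = { 256, 304, 362, 431 };

/* Approximate 2^(x_q2 / 4) in Q8, where x_q2 is a non-negative exponent
 * with 2 fractional bits. The integer part becomes a shift, the fractional
 * part a table lookup, so no floating-point call is needed.
 * The integer part must stay small enough that the shift does not overflow. */
uint32_t example_exp2_q8(unsigned x_q2)
{
    unsigned whole = x_q2 >> 2;   /* integer part of the exponent */
    unsigned frac  = x_q2 & 3;    /* fractional part, in quarters */
    return (uint32_t)example_exp2_q8_lut[frac] << whole;
}
```

For example, an exponent of 1.5 is passed as `6` and yields 724, close to the exact value of 2^1.5 * 256.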

Fiona Glaser [Thu, 11 Dec 2008 04:53:13 +0000 (20:53 -0800)]
Suppress saveptr warnings on Windows GCC

Fiona Glaser [Thu, 11 Dec 2008 04:52:06 +0000 (20:52 -0800)]
More small speed tweaks to macroblock.c

Fiona Glaser [Mon, 8 Dec 2008 21:44:23 +0000 (13:44 -0800)]
Much faster CAVLC residual coding
Use a VLC table for common levelcodes instead of constructing them on-the-spot
Branchless version of i_trailing calculation (2x faster on Nehalem)
Completely remove array_non_zero_count and instead use the count calculated in level/run coding.  Note: this slightly changes output with subme > 7 due to different nonzero counts being stored during qpel RD.
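
One way to count CAVLC trailing ones (at most three coefficients of magnitude 1 at the high-frequency end) without an early-exit loop is a small mask plus a lookup table, sketched below. This is an assumed formulation for illustration, with hypothetical names; it is not claimed to be the code in this commit, and whether a compiler emits it fully branch-free depends on the target.

```c
#include <stdint.h>

/* level[] holds the nonzero levels in reverse scan order (highest frequency
 * first); total is how many there are. Returns 0..3. */
int example_trailing_ones(const int16_t *level, int total)
{
    /* Index of the lowest set bit in a 3-bit mask, or 3 if the mask is 0. */
    static const uint8_t first_set_bit[8] = { 3, 0, 1, 0, 2, 0, 1, 0 };
    unsigned mask = 0;
    int i;

    /* Bit i is set when entry i is missing or is not +1/-1. */
    for (i = 0; i < 3; i++) {
        int is_one = (i < total) && (level[i] == 1 || level[i] == -1);
        mask |= (unsigned)!is_one << i;
    }
    return first_set_bit[mask];
}
```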

Guillaume Poirier [Fri, 5 Dec 2008 21:26:55 +0000 (22:26 +0100)]
fix compilation with GCC-4.3+

Fiona Glaser [Sun, 30 Nov 2008 07:13:58 +0000 (23:13 -0800)]
High Profile allows 25% higher maxbitrate/cpb
Correct level detection to take this into account.
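
As a worked example of the 25% allowance, the sketch below scales a base limit for High Profile. The function name and the idea of passing the base limit as a plain integer are assumptions for illustration; the sample value 20000 is only an input, not a claim about any particular level.

```c
/* Return the effective maxbitrate/cpb limit: High Profile gets 25% more
 * than the base limit for the same level. */
int example_level_limit(int base_limit, int is_high_profile)
{
    return is_high_profile ? (base_limit * 5) / 4 : base_limit;
}
```

With a base limit of 20000, the High Profile limit is 25000, so a stream at 24000 would fit that level under High Profile but not otherwise.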

BugMaster [Sat, 29 Nov 2008 22:04:29 +0000 (14:04 -0800)]
s/nasm/yasm in VS project file

Fiona Glaser [Sat, 29 Nov 2008 12:49:18 +0000 (04:49 -0800)]
Cosmetic: update various file headers.

Loren Merritt [Sat, 29 Nov 2008 11:54:02 +0000 (11:54 +0000)]
add date and compiler to `x264 --version`

Fiona Glaser [Fri, 28 Nov 2008 22:32:11 +0000 (14:32 -0800)]
10L in r1041

Fiona Glaser [Fri, 28 Nov 2008 03:37:56 +0000 (19:37 -0800)]
Significantly faster CABAC and CAVLC residual coding and bit cost calculation
Early-terminate in residual writing using stored nnz counts
To allow the above, store nnz counts for luma and chroma DC
Add assembly functions to find the last nonzero coefficient in a block
Overall ~1.9% faster at subme9+8x8dct+qp25 with CAVLC, ~0.7% faster with CABAC
Note this changes output slightly with CABAC RDO because it requires always storing correct nnz values during RDO, which wasn't done before in cases where it wasn't useful.
CAVLC output should be equivalent.
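
A plain-C reference for "find the last nonzero coefficient" might look like the sketch below (hypothetical name, not the assembly added here). A residual writer can stop early when the stored nnz count is zero and otherwise start from the index this returns.

```c
#include <stdint.h>

/* Return the index of the last nonzero coefficient in scan order,
 * or -1 if the block is entirely zero. */
int example_coeff_last(const int16_t *coef, int n)
{
    int i = n - 1;
    while (i >= 0 && coef[i] == 0)
        i--;
    return i;
}
```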

Fiona Glaser [Thu, 27 Nov 2008 07:42:55 +0000 (23:42 -0800)]
dequant_4x4_dc assembly
About 3.5x faster DC dequant on Conroe

Loren Merritt [Thu, 27 Nov 2008 02:37:46 +0000 (02:37 +0000)]
fix an overflow in dct4x4dc_mmx
(unlikely to have occurred in any real video)

Fiona Glaser [Wed, 26 Nov 2008 00:30:39 +0000 (16:30 -0800)]
Remove nasm support
Nasm won't correctly parse the SSE4 code introduced a few revisions ago, so we're removing support.
Users should upgrade to yasm 0.6.1 or later.

BugMaster [Tue, 25 Nov 2008 23:11:24 +0000 (15:11 -0800)]
Fix rare warning messages in ratecontrol due to r1020

BugMaster [Tue, 25 Nov 2008 23:10:43 +0000 (15:10 -0800)]
Fix MSVC compilation and clean up MSVC build file
Remove Release64 which never worked anyways.

Fiona Glaser [Tue, 25 Nov 2008 09:04:26 +0000 (01:04 -0800)]
Faster width4 SSD+SATD, SSE4 optimizations
Do satd 4x8 by transposing the two blocks' positions and running satd 8x4.
Use pinsrd (SSE4) for faster width4 SSD
Globally replace movlhps with punpcklqdq (it seems to be faster on Conroe)
Move mask_misalign declaration to cpu.h to avoid warning in encoder.c.
These optimizations help on Nehalem, Phenom, and Penryn CPUs.

Guillaume Poirier [Tue, 25 Nov 2008 16:27:27 +0000 (17:27 +0100)]
fix indentation, whitespace cleanup, more consistent indentation of macro backslashes

David Wolstencroft [Sat, 22 Nov 2008 16:54:38 +0000 (17:54 +0100)]
Change some macros to be more sensitive to memory alignment, thus avoiding
useless loads/stores and calculations of permutation vectors.
Affected functions are all of mc_luma, mc_chroma, 'get_ref', SATD, SA8D and deblock.
Overall gains vary from ~5% to 15%, depending on settings, when running on a 1.42 GHz G4.

Loren Merritt [Fri, 7 Nov 2008 05:31:24 +0000 (05:31 +0000)]
refactor satd. 20KB smaller binary.
refactor sa8d. slightly faster.
more checkasm for hadamard.

Fiona Glaser [Tue, 25 Nov 2008 05:56:24 +0000 (21:56 -0800)]
Fix crash with threads and SSEMisalign on Phenom
Misalign mask needed to be set separately for each encoding thread.

Fiona Glaser [Fri, 21 Nov 2008 11:39:11 +0000 (03:39 -0800)]
Phenom CPU optimizations
Faster hpel_filter by using unaligned loads instead of emulated PALIGNR
Faster hpel_filter on 64-bit by using the 32-bit version (the cost of emulated PALIGNR is high enough that the savings from caching intermediate values are not worth it).
Add support for misaligned_mask on Phenom: ~2% faster hpel_filter, ~4% faster width16 multisad, 7% faster width20 get_ref.
Replace width12 mmx with width16 sse on Phenom and Nehalem: 32% faster width12 get_ref on Phenom.
Merge cpu-32.asm and cpu-64.asm
Thanks to Easy123 for contributing a Phenom box for a weekend so I could write these optimizations.

Fiona Glaser [Fri, 21 Nov 2008 04:11:14 +0000 (20:11 -0800)]
A few tweaks to decimate asm
A little bit faster on both 32-bit and 64-bit

Fiona Glaser [Thu, 13 Nov 2008 00:50:31 +0000 (16:50 -0800)]
Nehalem optimization part 2: SSE2 width-8 SAD
Helps a bit on Phenom as well
~25% faster width8 multiSAD on Nehalem

Fiona Glaser [Tue, 11 Nov 2008 07:34:02 +0000 (23:34 -0800)]
Add subme=0 (fullpel motion estimation only)
Only for experimental purposes and ultra-fast encoding.  Probably not a good idea for firstpass.

Fiona Glaser [Mon, 10 Nov 2008 23:34:48 +0000 (15:34 -0800)]
Fix minor memory leak in r1022

Fiona Glaser [Mon, 10 Nov 2008 23:32:06 +0000 (15:32 -0800)]
r1024 borked checkasm
Remove idct/dct2x2 from checkasm as they are no longer in dctf

Fiona Glaser [Mon, 10 Nov 2008 01:39:21 +0000 (17:39 -0800)]
Faster chroma encoding
9-12% faster chroma encode.
Move all functions for handling chroma DC that don't have assembly versions to macroblock.c and inline them, along with a few other tweaks.

Fiona Glaser [Mon, 10 Nov 2008 01:34:31 +0000 (17:34 -0800)]
Various cosmetics and minor fixes
Disable hadamard_ac sse2/ssse3 under stack_mod4
Fix one MSVC compilation warning
Fix compilation in debug mode in certain cases on x64
Remove eval.c from MSVC project
Fix crash when VBV is used in CQP mode
Patches by MasterNobody

Fiona Glaser [Sun, 9 Nov 2008 04:16:17 +0000 (20:16 -0800)]
Faster b-adapt + adaptive quantization
Factor out pow to be only called once per macroblock.  Speeds up b-adapt, especially b-adapt 2, considerably.
Speed boost is as high as 24% with b-adapt 2 + b-frames 16.

Fiona Glaser [Fri, 7 Nov 2008 19:39:43 +0000 (11:39 -0800)]
Faster CABAC residual encoding
6% faster block_residual_write_cabac in RD mode.

Fiona Glaser [Thu, 6 Nov 2008 03:51:59 +0000 (19:51 -0800)]
Fix potential crash in the case that the input statsfile is too short
Also resolve various other potential weirdness (such as multiple copies of the same error message in threaded mode).

Fiona Glaser [Wed, 5 Nov 2008 11:11:45 +0000 (03:11 -0800)]
Initial Nehalem CPU optimizations
movaps/movups are no longer interchangeable with their integer counterparts on Nehalem, so that substitution is removed.
Nehalem has a much lower cacheline split penalty than previous Intel CPUs, so cacheline workarounds are no longer necessary.
Thanks to Intel for providing Avail Media with the pre-release Nehalem CPU needed to prepare these (and other not-yet-committed) optimizations.
Overall speed improvement with Nehalem vs Penryn at the same clock speed is around 40%.

Gabriel Bouvigne [Tue, 4 Nov 2008 17:56:03 +0000 (09:56 -0800)]
Fix potential infinite loop in VBV under GCC 4.2

Fiona Glaser [Tue, 4 Nov 2008 06:59:49 +0000 (22:59 -0800)]
Encoder_reconfig: esa/tesa can only be enabled if they were on to begin with
Bug report by kemuri-_9.

Loren Merritt [Thu, 30 Oct 2008 07:47:09 +0000 (00:47 -0700)]
Fix bug in hadamard_ac SSE assembly
Some extreme inputs could cause overflows.

Fiona Glaser [Wed, 29 Oct 2008 03:35:15 +0000 (20:35 -0700)]
Full sub8x8 RD mode decision
Small speed penalty with p4x4 enabled, but significant quality gain at subme >= 6
As before, gain is proportional to the amount of p4x4 actually useful in a given input at the given bitrate.

Fiona Glaser [Sat, 25 Oct 2008 08:50:08 +0000 (01:50 -0700)]
Optimize CABAC bit cost calculation
Speed up cabac mvd and add new precalculated transition/entropy table.
Add "noup" function for cabac operations to not update the state table when it isn't necessary.
1-3% faster macroblock_size_cabac.
Cosmetics

Anders Ossowicki [Fri, 24 Oct 2008 05:36:11 +0000 (22:36 -0700)]
Replace "git-command" with "git command" in version.sh for git 1.6 support

Loren Merritt [Thu, 23 Oct 2008 20:45:04 +0000 (13:45 -0700)]
Add assembly version of CAVLC 8x8dct interleave
Faster CAVLC encoding and RDO with 8x8dct

Alexander Strange [Wed, 22 Oct 2008 22:55:30 +0000 (15:55 -0700)]
Add support for psy-rd/trellis to encoder_reconfig

Alexander Strange [Wed, 22 Oct 2008 22:00:43 +0000 (15:00 -0700)]
Fix Darwin speed regression

Gabriel Bouvigne [Wed, 22 Oct 2008 21:48:47 +0000 (14:48 -0700)]
Further improve prediction of bitrate and VBV in threaded mode

Fiona Glaser [Wed, 22 Oct 2008 20:37:09 +0000 (13:37 -0700)]
Sub-8x8 Qpel-RD in P-frames
Improves quality when using p8x4/p4x8/p4x4 subpartitions
Benefit is proportional to how many sub-8x8 partitions are used; helps most at high bitrates and low resolutions.

Fiona Glaser [Wed, 22 Oct 2008 09:20:06 +0000 (02:20 -0700)]
Faster qpel-RD
3-4% faster qpel-RD; avoid re-checking bmv/pmv during the hex search.

Fiona Glaser [Wed, 22 Oct 2008 07:37:00 +0000 (00:37 -0700)]
Some minor optimizations in RD refinement
Don't write b subpartition in CABAC RDO
Calculate nonzero count in i4x4 CAVLC RDO

Fiona Glaser [Wed, 22 Oct 2008 03:17:18 +0000 (20:17 -0700)]
Faster deblocking when p4x4 isn't used
Most of the MV checks can be skipped, resulting in faster strength calculation

Fiona Glaser [Wed, 22 Oct 2008 02:38:21 +0000 (19:38 -0700)]
Print profile and level information upon starting encode
Previously level was only printed as part of autodetect, and only in verbose mode.

Fiona Glaser [Wed, 22 Oct 2008 00:10:46 +0000 (17:10 -0700)]
Fix possible crash in trellis at very low QPs

Fiona Glaser [Tue, 21 Oct 2008 21:59:07 +0000 (14:59 -0700)]
Add assembly versions of decimate_score
3-7x faster decimation, 1-3% faster overall

Fiona Glaser [Sat, 18 Oct 2008 10:40:59 +0000 (03:40 -0700)]
Fix typo in subme8/9 lossless qpel-RD
Slightly improves compression.

Fiona Glaser [Thu, 16 Oct 2008 10:17:53 +0000 (03:17 -0700)]
Extend trellis to support luma/chroma DC and chroma AC
Small speed loss in trellis 1, slightly larger in trellis 2, but significant quality improvement.

Loren Merritt [Fri, 3 Oct 2008 02:57:08 +0000 (20:57 -0600)]
rm gtk, avc2avi.
I don't remember why I allowed a gui into the repository in the first place. There's nothing that makes this one special relative to all the other x264 guis.
avc2avi doesn't compile since we removed the bitstream reader. And avc doesn't belong in avi.

Fiona Glaser [Fri, 3 Oct 2008 01:11:13 +0000 (18:11 -0700)]
Resolve quality regression in r996
Accidentally removed the wrong line of code.  I think this classifies as a "10l".
Thanks to techouse for initial bug report and skystrife for helping me find it.

Ralf Terdic [Thu, 2 Oct 2008 15:52:33 +0000 (08:52 -0700)]
Fix minor memory leak accidentally added with the addition of b-adapt 2

Fiona Glaser [Wed, 1 Oct 2008 01:34:56 +0000 (18:34 -0700)]
Rework subme system, add RD refinement in B-frames
The new system is as follows: subme6 is RD in I/P frames, subme7 is RD in all frames, subme8 is RD refinement in I/P frames, and subme9 is RD refinement in all frames.
subme6 == old subme6, subme7 == old subme6+brdo, subme8 == old subme7+brdo, subme9 == no equivalent
--b-rdo has, accordingly, been removed.  --bime has also been removed, and instead enabled automatically at subme >= 5.
RD refinement in B-frames (subme9) includes both qpel-RD and an RD version of bime.
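
The new subme levels can be summarized as a small lookup, sketched below. The struct and function are hypothetical, and the sketch assumes the levels are cumulative (each level keeps everything the lower ones enable), which is how the old-to-new mapping above reads.

```c
/* Hypothetical summary of what a given subme level enables. */
typedef struct {
    int bime;          /* bidirectional ME refinement, automatic at subme >= 5 */
    int rd_ip;         /* RD mode decision in I/P frames */
    int rd_b;          /* RD mode decision in B-frames */
    int rd_refine_ip;  /* RD refinement in I/P frames */
    int rd_refine_b;   /* RD refinement in B-frames */
} example_subme_features;

example_subme_features example_subme_lookup(int subme)
{
    example_subme_features f;
    f.bime         = subme >= 5;
    f.rd_ip        = subme >= 6;
    f.rd_b         = subme >= 7;
    f.rd_refine_ip = subme >= 8;
    f.rd_refine_b  = subme >= 9;
    return f;
}
```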

Fiona Glaser [Mon, 29 Sep 2008 07:11:38 +0000 (00:11 -0700)]
Fix potential miscompilation of some inline asm
Caused problems under some gcc 4.x versions with predictive lossless

Fiona Glaser [Sat, 27 Sep 2008 23:37:27 +0000 (16:37 -0700)]
Replace High 4:4:4 profile lossless with High 4:4:4 Predictive.
This improves lossless compression by about 4-25% depending on source.
The benefit is generally higher for intra-only compression.
Also add support for 8x8dct and i8x8 blocks in lossless mode; this improves compression very slightly.
In some rare cases 8x8dct can hurt compression in lossless mode, but it's usually helpful, albeit marginally.
Note that 8x8dct is only available with CABAC as it is never useful with CAVLC.
High 4:4:4 Predictive replaced the previous profile in a 2007 revision to the H.264 standard.
The only known compliant decoder for this profile is the latest version of CoreAVC.
As I write this, JM does not actually correctly decode this profile.
Hopefully this lack of support will soon change with this commit, as x264 will be (to my knowledge) the first compliant encoder.

Fiona Glaser [Fri, 26 Sep 2008 16:19:56 +0000 (09:19 -0700)]
Fix typo in progress indicator when using piped input

Loren Merritt [Mon, 22 Sep 2008 10:17:35 +0000 (04:17 -0600)]
avg_weight_ssse3

Loren Merritt [Sat, 20 Sep 2008 14:41:17 +0000 (08:41 -0600)]
fix bitstream writer on bigendian 64bit (regression in r903)

Loren Merritt [Sat, 20 Sep 2008 05:52:11 +0000 (23:52 -0600)]
remove authors whose code no longer exists

Loren Merritt [Mon, 15 Sep 2008 11:00:26 +0000 (05:00 -0600)]
more diagnostics when configure finds an unsuitable assembler

Fiona Glaser [Fri, 26 Sep 2008 16:19:56 +0000 (09:19 -0700)]
Make x264 progress indicator more concise
Now the % indicator should be readable on the header of a minimized window on Windows systems.

Fiona Glaser [Mon, 22 Sep 2008 05:17:34 +0000 (22:17 -0700)]
Fix deblocking + threads + AQ bug
At low QPs, with threads and deblocking on, deblocking could be improperly disabled.
Revision in which this bug was introduced is unknown; it may be as old as b_variable_qp in x264 itself.

Fiona Glaser [Sun, 21 Sep 2008 20:35:00 +0000 (13:35 -0700)]
Resolve possible crash in bime, improve the fix in r985

Fiona Glaser [Sun, 21 Sep 2008 02:36:07 +0000 (19:36 -0700)]
Fix rare crash issue in b-adapt
Regression *probably* in r979

Holger Lubitz [Sat, 20 Sep 2008 09:36:55 +0000 (02:36 -0700)]
Merging Holger's GSOC branch part 1: hpel_filter speedups

Loren Merritt [Sat, 20 Sep 2008 18:31:10 +0000 (12:31 -0600)]
r980 borked weighted bime

Fiona Glaser [Sat, 20 Sep 2008 08:39:16 +0000 (01:39 -0700)]
Disable I_PCM with psy-RD
psy-RD seems to put the PCM threshold a bit lower than it should be, so PCM is now disabled under psy-RD.

Fiona Glaser [Fri, 19 Sep 2008 16:21:34 +0000 (09:21 -0700)]
Merge avg and avg_weight
avg_weight no longer has to be special-cased in the code; faster weightb

Fiona Glaser [Thu, 18 Sep 2008 04:25:05 +0000 (21:25 -0700)]
Rewrite avg/avg_weight to take two source pointers
This allows the use of get_ref instead of mc_luma almost everywhere for bipred
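
The shape of a two-source average can be sketched in plain C as below: each source has its own pointer and stride, so the inputs need not be copied into a common buffer first. The name and signature are hypothetical and do not reproduce the x264 function; the rounding shown is the usual (a + b + 1) >> 1.

```c
#include <stdint.h>

/* Average two 8-bit sources into dst, rounding up on ties. */
void example_avg(uint8_t *dst, int dst_stride,
                 const uint8_t *src1, int src1_stride,
                 const uint8_t *src2, int src2_stride,
                 int width, int height)
{
    int x, y;
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++)
            dst[x] = (uint8_t)((src1[x] + src2[x] + 1) >> 1);
        dst  += dst_stride;    /* advance each buffer by its own stride */
        src1 += src1_stride;
        src2 += src2_stride;
    }
}
```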

Fiona Glaser [Wed, 17 Sep 2008 07:33:37 +0000 (00:33 -0700)]
Use low-resolution lookahead motion vectors as an extra predictor
Improves quality considerably (0-5%) in 1pass/CRF mode, especially with lower --me values and complex motion.
Reverses the order of lowres lookahead search to improve the usefulness of the extra predictors.

Fiona Glaser [Wed, 17 Sep 2008 05:44:10 +0000 (22:44 -0700)]
Add missing free() for f_qp_offset in frame.c