git.sesse.net Git - x264/log
x264
Anton Mitrofanov [Wed, 19 May 2010 17:07:03 +0000 (21:07 +0400)]
Fix regression in r1566
Intra stats need to be kept track of for fast intra decision.

Fiona Glaser [Tue, 18 May 2010 18:53:32 +0000 (11:53 -0700)]
Fix rc-lookahead in encoding options SEI in 2-pass with VBV

Loren Merritt [Mon, 17 May 2010 21:08:37 +0000 (14:08 -0700)]
Reduce memory usage in 2-pass with b-adapt 2

Fiona Glaser [Sat, 15 May 2010 21:48:58 +0000 (14:48 -0700)]
Overhaul CABAC: faster, less cache usage
Horribly munge up the CABAC tables to allow deduplication of some data.
Saves 256 bytes of L1d cache in non-RD, 512 bytes in RD.
Add asm versions of bypass and terminal; save L1i cache by re-using putbyte code.
Further optimize encode_decision.
All 3 primary CABAC functions fit in under 256 bytes of code total on x86_64.

Kieran Kunhya [Thu, 13 May 2010 18:13:35 +0000 (19:13 +0100)]
Fix typo in pulldown

Anton Mitrofanov [Wed, 12 May 2010 18:05:34 +0000 (22:05 +0400)]
Fix bitrate calculation in progress status
Was slightly incorrect due to using pts, which is out of order.

Anton Mitrofanov [Tue, 11 May 2010 21:57:38 +0000 (01:57 +0400)]
Fix crash with sliced-threads on Phenom

Fiona Glaser [Tue, 11 May 2010 05:59:12 +0000 (22:59 -0700)]
Fix condition for printing rc=cbr in options SEI
Also fix crf-max formatting.

Henrik Gramner [Mon, 10 May 2010 21:27:36 +0000 (23:27 +0200)]
Shrink even more constant arrays

Fiona Glaser [Sat, 8 May 2010 19:07:13 +0000 (12:07 -0700)]
Add API function to trigger intra refresh
Useful for interactive applications where the encoder knows that packet loss has occurred on the client.
Full documentation is in x264.h.

Fiona Glaser [Sat, 8 May 2010 18:58:22 +0000 (11:58 -0700)]
Fix intra refresh behavior with I-frames
Intra refresh still allows I-frames (for scenecuts/etc).
Now I-frames count as a full refresh, as opposed to instantly triggering a refresh.

Anton Mitrofanov [Thu, 6 May 2010 17:03:31 +0000 (10:03 -0700)]
More cosmetics

Fiona Glaser [Thu, 6 May 2010 07:53:20 +0000 (00:53 -0700)]
Fix unresolved symbol in r1573
gnu ld didn't complain, but some other linkers did.

Steven Walters [Wed, 5 May 2010 23:54:04 +0000 (19:54 -0400)]
Remove unnecessary --enable options
Change --enable-visualize to actually check for X11 support.

Fiona Glaser [Tue, 4 May 2010 04:27:16 +0000 (21:27 -0700)]
Don't force row QPs to integer values with VBV
VBV should no longer raise the bitrate of the video.  That is, at a given quality level or average bitrate, turning on VBV should only lower the bitrate.
This isn't quite true if adaptive quant is off, but nobody should be doing that anyways.
Also may result in slightly more accurate per-row VBV ratecontrol.

James Darnley [Sun, 2 May 2010 23:30:50 +0000 (16:30 -0700)]
Add field-order detection to y4m demuxer

Fiona Glaser [Sun, 2 May 2010 18:45:15 +0000 (11:45 -0700)]
Fix sliced-threads + interlaced
Broken in r1546.

Fiona Glaser [Sun, 2 May 2010 18:41:36 +0000 (11:41 -0700)]
Improve temporal MV prediction
Predict based on the results of p16x16 search, not final MVs.
This lets us get predictions even if mode decision chose intra.
Also improves cache coherency.

Fiona Glaser [Sun, 2 May 2010 02:34:14 +0000 (19:34 -0700)]
More accurate MV prediction on edges in lookahead

Fiona Glaser [Sun, 2 May 2010 02:32:01 +0000 (19:32 -0700)]
Error out on invalid input stride
Might catch some crashes due to buggy calling applications.

Fiona Glaser [Sat, 1 May 2010 07:18:01 +0000 (00:18 -0700)]
Remove unnecessary debugging assert
Shouldn't have been in r1568 to begin with.

Fiona Glaser [Fri, 30 Apr 2010 20:45:50 +0000 (13:45 -0700)]
Shrink some more constant arrays

Fiona Glaser [Fri, 30 Apr 2010 18:36:19 +0000 (11:36 -0700)]
Deduplicate asm constants, automate name prefixing
Auto-prefix global constants with x264_ in cextern.
Eliminate x264_ prefix from asm files; automate it in cglobal.
Deduplicate asm constants wherever possible to save data cache (move them to a new const-a.asm).
Remove x264_emms() entirely on non-x86 (don't even call an empty function).
Add cextern_naked for a non-prefixed cextern (used in checkasm).

Fiona Glaser [Fri, 30 Apr 2010 16:57:55 +0000 (09:57 -0700)]
Shrink a few x86 asm functions
Add a few more instructions to cut down on the use of the 4-byte addressing mode.

Fiona Glaser [Fri, 30 Apr 2010 02:53:59 +0000 (19:53 -0700)]
Make options SEI use weight* instead of wpred*
More intuitive and maps more reasonably to the CLI options.
Breaks statsfile backwards-compatibility.

Loren Merritt [Thu, 29 Apr 2010 17:35:25 +0000 (17:35 +0000)]
r1548 broke subme < 3 + p8x8/b8x8
Caused significantly worse compression.  Preset-wise, only affected veryfast.
Fixed by not modifying mvc in-place.
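
A minimal sketch of the "not in-place" idea, assuming the bug came from rounding and clipping the MV candidate list where it was stored, so later partition searches saw already-modified candidates. The `mv_t` type, the function names, and the full-pel limits are hypothetical, not x264's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical motion vector type in quarter-pel units. */
typedef struct { int16_t x, y; } mv_t;

static int clip_int( int v, int lo, int hi )
{
    return v < lo ? lo : v > hi ? hi : v;
}

/* Round quarter-pel candidates to full-pel and clip them into out[].
 * mvc[] is const: the caller's candidate list stays untouched, so it can be
 * reused by later searches (p8x8/b8x8 in the commit's case).
 * Assumes >> on negative values is an arithmetic shift, as on common compilers. */
static void round_clip_candidates( const mv_t *mvc, mv_t *out, int n,
                                   int min_fpel, int max_fpel )
{
    assert( n >= 0 && min_fpel <= max_fpel );
    for( int i = 0; i < n; i++ )
    {
        out[i].x = (int16_t)clip_int( ( mvc[i].x + 2 ) >> 2, min_fpel, max_fpel );
        out[i].y = (int16_t)clip_int( ( mvc[i].y + 2 ) >> 2, min_fpel, max_fpel );
    }
}
```

The cost is one small scratch array per search; the benefit is that the quarter-pel candidates survive for any later use.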

Henrik Gramner [Mon, 26 Apr 2010 23:44:33 +0000 (01:44 +0200)]
More write-combining

Fiona Glaser [Mon, 26 Apr 2010 22:10:11 +0000 (15:10 -0700)]
Reduce lookahead memory usage, cache misses
Merge lowres_types with lowres_costs.

Fiona Glaser [Sun, 25 Apr 2010 21:54:29 +0000 (14:54 -0700)]
Fix build on x86 with asm on but SSE off

Fiona Glaser [Sat, 24 Apr 2010 20:55:51 +0000 (13:55 -0700)]
Don't calculate ref/partition stats if not necessary

Fiona Glaser [Sat, 24 Apr 2010 20:07:18 +0000 (13:07 -0700)]
Split out MV prediction into mvpred.c
Make common/macroblock.c a bit less gigantic.

Loren Merritt [Sat, 24 Apr 2010 16:22:14 +0000 (16:22 +0000)]
Fix mv predictor clipping on non-x86 (regression in r1548)

Anton Mitrofanov [Fri, 23 Apr 2010 20:26:13 +0000 (00:26 +0400)]
Move getopt.c to x264cli sources from libx264
Only affects builds on systems without getopt.c.

Fiona Glaser [Thu, 22 Apr 2010 19:53:07 +0000 (12:53 -0700)]
Move deblocking code to a separate file
Should clean up frame.c a bit.

Steven Walters [Tue, 20 Apr 2010 23:48:02 +0000 (19:48 -0400)]
fix ffms demuxer to support input timebase values > 2^31

Fiona Glaser [Tue, 20 Apr 2010 23:53:06 +0000 (16:53 -0700)]
Fix 10l in cache_load changes
Broke constrained intra pred, probably not anything else.

Fiona Glaser [Tue, 20 Apr 2010 23:50:13 +0000 (16:50 -0700)]
Faster fullpel predictor checking
Also shave a few instructions off dia/hex motion estimation loops.

Loren Merritt [Tue, 20 Apr 2010 09:40:49 +0000 (09:40 +0000)]
Fix checkasm's generation of deblock inputs (regression in r1517)

Loren Merritt [Tue, 20 Apr 2010 09:17:18 +0000 (09:17 +0000)]
Fix printing of bitrate when timestamps aren't available
Doesn't affect x264cli, but was broken in some other apps in CFR mode.

Fiona Glaser [Tue, 20 Apr 2010 07:46:29 +0000 (00:46 -0700)]
Don't check mv0 twice
One less SAD in motion estimation.
Also rename bmv -> pmv; more accurate naming.

Fiona Glaser [Mon, 19 Apr 2010 18:02:27 +0000 (11:02 -0700)]
Remove reordering restrictions from weightp
Apparently the spec does allow two consecutive copies of the same frame in the reference list.
This involves an incredibly ugly hack to wrap around the frame number.
Very slight compression improvement.

Fiona Glaser [Tue, 20 Apr 2010 06:34:03 +0000 (23:34 -0700)]
Print intra chroma pred modes in stats

Fiona Glaser [Mon, 19 Apr 2010 05:54:48 +0000 (22:54 -0700)]
Add mv0 special case in pskip chroma MC
Significantly faster pskip MC.
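
Why a zero MV is worth special-casing can be sketched as follows. H.264 chroma motion compensation is a bilinear blend with an eighth-pel fraction; with a zero vector every weight except the first vanishes and the blend reduces to a copy. The function names and signatures below are illustrative assumptions, not x264's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* General chroma MC: bilinear interpolation with eighth-pel fractions.
 * Reads one extra row and column beyond the block, so src must provide them.
 * Assumes >> on negative MVs is an arithmetic shift. */
static void mc_chroma_general( uint8_t *dst, int dst_stride,
                               const uint8_t *src, int src_stride,
                               int mvx, int mvy, int w, int h )
{
    int dx = mvx & 7, dy = mvy & 7;
    int cA = ( 8 - dx ) * ( 8 - dy ), cB = dx * ( 8 - dy );
    int cC = ( 8 - dx ) * dy,         cD = dx * dy;
    src += ( mvy >> 3 ) * src_stride + ( mvx >> 3 );
    for( int y = 0; y < h; y++, dst += dst_stride, src += src_stride )
        for( int x = 0; x < w; x++ )
            dst[x] = (uint8_t)( ( cA * src[x] + cB * src[x + 1] +
                                  cC * src[x + src_stride] +
                                  cD * src[x + src_stride + 1] + 32 ) >> 6 );
}

/* P-skip variant: a zero MV needs no interpolation, only a row-by-row copy. */
static void mc_chroma_pskip( uint8_t *dst, int dst_stride,
                             const uint8_t *src, int src_stride,
                             int mvx, int mvy, int w, int h )
{
    assert( w > 0 && h > 0 );
    if( ( mvx | mvy ) == 0 )
    {
        for( int y = 0; y < h; y++ )
            memcpy( dst + y * dst_stride, src + y * src_stride, (size_t)w );
        return;
    }
    mc_chroma_general( dst, dst_stride, src, src_stride, mvx, mvy, w, h );
}
```

Both paths produce identical pixels for a zero vector; the copy path simply skips four multiplies per pixel, which matters because skipped macroblocks in static content very often have a zero MV.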

Francois Cartegnie [Sun, 18 Apr 2010 20:04:59 +0000 (13:04 -0700)]
Fix build scripts to work with non-GNU tools

Fiona Glaser [Sat, 17 Apr 2010 03:04:13 +0000 (20:04 -0700)]
Faster deblock reference frame checks
Use a lookup table to simplify logic

Henrik Gramner [Fri, 16 Apr 2010 20:39:45 +0000 (22:39 +0200)]
Faster chroma CBP handling

Fiona Glaser [Fri, 16 Apr 2010 18:36:43 +0000 (11:36 -0700)]
Fix issues with extremely large timebases
With timebase denominators >= 2^30, x264 would silently overflow and cause odd issues.
Now x264 will explicitly fail with timebase denominators >= 2^31 and work with timebase denominators 2^31 > x >= 2^30.
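
A rough sketch of the kind of check described, assuming the timebase is passed as two unsigned 32-bit values. `timebase_is_valid` and `pts_to_msec` are hypothetical helpers, not x264 API, and the non-zero checks are an added assumption; the point is only that the limit is tested explicitly and that arithmetic on timestamps uses 64-bit intermediates.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the timebase can be used, 0 otherwise.
 * Denominators of 2^31 and above are rejected outright instead of being
 * allowed to overflow silently later on. */
static int timebase_is_valid( uint32_t num, uint32_t den )
{
    return num != 0 && den != 0 && den < ( UINT32_C(1) << 31 );
}

/* Convert a timestamp in timebase units to milliseconds.
 * 64-bit intermediates keep denominators in [2^30, 2^31) safe for
 * reasonably sized pts values; extremely large pts can still overflow. */
static int64_t pts_to_msec( int64_t pts, uint32_t num, uint32_t den )
{
    assert( timebase_is_valid( num, den ) );
    return pts * 1000 * (int64_t)num / (int64_t)den;
}
```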

Fiona Glaser [Fri, 16 Apr 2010 19:06:07 +0000 (12:06 -0700)]
MMX code for predictor rounding/clipping
Faster predictor checking at subme < 3.

Fiona Glaser [Fri, 16 Apr 2010 10:06:46 +0000 (03:06 -0700)]
Fix four minor bugs found by Clang

Fiona Glaser [Thu, 15 Apr 2010 23:32:31 +0000 (16:32 -0700)]
Move deblocking/hpel into sliced threads
Instead of doing both as a separate pass, do them during the main encode.
This requires disabling deblocking between slices (disable_deblock_idc == 2).
Overall performance gain is about 11% on --preset superfast with sliced threads.
Doesn't reduce the amount of actual computation done: only better parallelizes it.

Fiona Glaser [Wed, 14 Apr 2010 21:43:25 +0000 (14:43 -0700)]
Prefetch MB data in cache_load
Dramatically reduces L1 cache misses.
~10% faster cache_load.

Fiona Glaser [Fri, 23 Apr 2010 19:09:37 +0000 (19:09 +0000)]
Fix a ton of pessimization caused by aliasing in cache_save and cache_load

Fiona Glaser [Fri, 23 Apr 2010 19:09:18 +0000 (19:09 +0000)]
Add CP128/M128 macros using SSE

Fiona Glaser [Sun, 11 Apr 2010 20:36:50 +0000 (13:36 -0700)]
Fix various early terminations with slices
Neighbouring type values (type_top, etc) are now loaded even if the MB isn't available for prediction.
Significant overall performance increase (as high as 5-10%+) with lots of slices (e.g. with slice-max-size).

Anton Mitrofanov [Tue, 13 Apr 2010 17:25:42 +0000 (21:25 +0400)]
Enable --fast-pskip on fast firstpass

Steven Walters [Tue, 13 Apr 2010 12:44:37 +0000 (08:44 -0400)]
Make interlaced detection in avisynth only apply to field-based input
Fixes improper flagging of progressive sources.

Anton Mitrofanov [Tue, 13 Apr 2010 15:55:12 +0000 (19:55 +0400)]
Set psy=0 in lossless mode
Doesn't actually affect output, just what's written in the SEI.

Loren Merritt [Sun, 11 Apr 2010 04:20:04 +0000 (04:20 +0000)]
Fix a use of sad_x4 that had non-mod64 stride
Minimal speed improvement, but fixes a violation of internal api.

Fiona Glaser [Sat, 10 Apr 2010 20:15:30 +0000 (13:15 -0700)]
Make keyint_min auto by default
Gives more reasonable default settings when using short GOPs.

Fiona Glaser [Sat, 10 Apr 2010 07:49:19 +0000 (00:49 -0700)]
Faster mv predictor checking at subme < 3
Simplify the predicted MV cost check.

Fiona Glaser [Sat, 10 Apr 2010 07:35:50 +0000 (00:35 -0700)]
Special case in qpel refine for subme=1
~15-20% faster qpel refine with subme=1.
Some minor cleanups in refine_subpel.

Henrik Gramner [Sat, 10 Apr 2010 00:21:01 +0000 (02:21 +0200)]
Cosmetics: VLC tables

Fiona Glaser [Sat, 10 Apr 2010 01:13:22 +0000 (18:13 -0700)]
Add faster mv0 special case for macroblock-tree
Improves performance on low-motion video.

Fiona Glaser [Fri, 9 Apr 2010 08:49:55 +0000 (01:49 -0700)]
Add miscompilation check for x264_clz
Running a Phenom-optimized build of x264 (e.g. -march=amdfam10) on a non-Phenom CPU didn't SIGILL; instead it would silently produce incorrect output.
Now, instead, it will error out loudly.
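
For background (not stated in the commit): the LZCNT instruction shares its encoding with a prefixed BSR, so a CPU without LZCNT runs it as BSR and returns the index of the highest set bit rather than the count of leading zeros. A startup self-test along the following lines would catch that. All names here are hypothetical, and `clz_miscompiled` only simulates the wrong result.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*clz_fn)( uint32_t );

/* Portable reference: number of leading zero bits in a nonzero 32-bit value. */
static int clz_reference( uint32_t x )
{
    assert( x != 0 );
    int n = 0;
    while( !( x & 0x80000000u ) )
    {
        x <<= 1;
        n++;
    }
    return n;
}

/* Simulates the failure mode: a BSR-style result (index of highest set bit). */
static int clz_miscompiled( uint32_t x )
{
    return 31 - clz_reference( x );
}

/* Startup check: returns 0 if the clz under test agrees with the reference
 * on a few probe values, -1 otherwise so the caller can abort loudly. */
static int check_clz( clz_fn clz )
{
    static const uint32_t probes[] = { 1u, 0x80u, 0x12345u, 0x80000000u };
    for( int i = 0; i < (int)( sizeof( probes ) / sizeof( probes[0] ) ); i++ )
        if( clz( probes[i] ) != clz_reference( probes[i] ) )
            return -1;
    return 0;
}
```

A single probe such as `clz(1)` would already separate the two behaviours (31 versus 0); several probes just make the check harder to pass by accident.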

Anton Mitrofanov [Wed, 7 Apr 2010 09:17:20 +0000 (12:17 +0300)]
Fixing floating-point exception in level-checking
Doesn't cause any issues for x264cli, but might impact some calling apps that care (e.g. Delphi apps).

Fiona Glaser [Fri, 9 Apr 2010 01:44:16 +0000 (18:44 -0700)]
Save a few bits in multislice encoding
Set the initial QP for each slice to the last QP of the previous slice.

Alex Wright [Wed, 7 Apr 2010 15:25:55 +0000 (01:25 +1000)]
Early termination in 16x8/8x16 search
Combine the actual cost of the first partition with the predicted cost of the second to avoid searching the second when possible.
Reduces the number of times the second partition is searched by up to ~75% in non-RD mode, ~10% in RD mode.
Negligible effect on compression.
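
The decision described above can be sketched like this. How the second partition's cost is predicted is not stated in the commit, so it is taken as an input here; the function name and parameters are hypothetical.

```c
#include <assert.h>

/* Decide whether the second half of a 16x8 or 8x16 split is worth searching.
 * cost_first:            measured cost of the first partition
 * predicted_cost_second: cheap estimate for the second partition (assumed given)
 * best_cost_so_far:      cost of the best mode found so far (e.g. 16x16)
 * Returns 1 to search the second partition, 0 to skip it. */
static int should_search_second( int cost_first, int predicted_cost_second,
                                 int best_cost_so_far )
{
    assert( cost_first >= 0 && predicted_cost_second >= 0 );
    return cost_first + predicted_cost_second < best_cost_so_far;
}
```

If the estimate is optimistic, a skipped search would almost never have won anyway, which fits the reported negligible effect on compression.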

Fiona Glaser [Wed, 7 Apr 2010 14:45:00 +0000 (07:45 -0700)]
Make MV prediction work across slice boundaries
Should improve motion search with lots of small slices, e.g. with slice-max-size.
Still restricted by sliced threads (won't cross the boundary between two threadslices).
The output-changing part of the previous patch.

Fiona Glaser [Wed, 7 Apr 2010 14:43:46 +0000 (07:43 -0700)]
Cleanup and simplification of macroblock_load
Doesn't do anything now, but will be useful for many future changes.
Splitting out neighbour calculation will make MBAFF implementation easier.
Calculation of neighbour_frame value (actual neighbouring MBs, ignoring slices) will be useful for some future patches.

Fiona Glaser [Wed, 7 Apr 2010 10:10:03 +0000 (03:10 -0700)]
Add missing #include to display-x11.c

Steven Walters [Wed, 7 Apr 2010 02:08:21 +0000 (22:08 -0400)]
Add TFF/BFF detection to all demuxers
Fix interlaced Avisynth input, automatically weave field-based input.

Fiona Glaser [Tue, 6 Apr 2010 20:53:22 +0000 (13:53 -0700)]
Correctly mark output frames as BREF
Simplify pic_out code.

Kieran Kunhya [Sat, 3 Apr 2010 21:59:59 +0000 (14:59 -0700)]
Fix HRD compliance
As usual, the spec is so insanely obfuscated that it's impossible to get things right the first time.

Alex Wright [Sat, 3 Apr 2010 21:50:26 +0000 (14:50 -0700)]
Better b16x8/8x16 early termination in B-frames
A bit slower but up to 1-2% better compression.

Fiona Glaser [Fri, 2 Apr 2010 19:23:52 +0000 (12:23 -0700)]
Fix 10L in B-skip improvement patch

Fiona Glaser [Fri, 2 Apr 2010 10:09:48 +0000 (03:09 -0700)]
Fix printing of SEI header with VBV + ABR
SEI header shouldn't say CBR unless bitrate == maxrate.

Fiona Glaser [Fri, 2 Apr 2010 05:33:42 +0000 (22:33 -0700)]
Simplify slicetype_frame_cost
Avoid redundant calculations when VBV is on (due to the intra-only call).
Move most of the logic into per-MB code.

Fiona Glaser [Thu, 1 Apr 2010 22:51:59 +0000 (15:51 -0700)]
Faster CABAC state copying for small partitions
Save ~25 clocks per i4x4, i8x8, and sub8x8 RD call.

Fiona Glaser [Wed, 31 Mar 2010 08:44:07 +0000 (01:44 -0700)]
Massive cosmetic and syntax cleanup
Convert all applicable loops to use C99 loop index syntax.
Clean up most inconsistent syntax in ratecontrol.c, visualize, ppc, etc.
Replace log(x)/log(2) constructs with log2, and similar with log10.
Fix all -Wshadow violations.
Fix visualize support.

Fiona Glaser [Wed, 31 Mar 2010 06:30:09 +0000 (23:30 -0700)]
Fix array overread in b8x16 search

Fiona Glaser [Tue, 30 Mar 2010 02:03:13 +0000 (19:03 -0700)]
Faster direct check with subpartitions off
Also simplify the whole function a bit.

Fiona Glaser [Mon, 29 Mar 2010 09:14:25 +0000 (02:14 -0700)]
Print crf-max with appropriate precision in SEI

Yusuke Nakamura [Mon, 29 Mar 2010 07:05:30 +0000 (00:05 -0700)]
Fix 10l in timecode seeking

Yusuke Nakamura [Mon, 29 Mar 2010 04:51:02 +0000 (13:51 +0900)]
Fix 10L: Remove needless error check
This error check was for cfr input + --timebase, but that doesn't happen, and brings about a bug with vfr input.

Fiona Glaser [Mon, 29 Mar 2010 03:40:42 +0000 (20:40 -0700)]
Don't use 2 L1 refs with pyramid + ref=1
Slightly faster encoding with ref=1.

Henrik Gramner [Sat, 27 Mar 2010 00:57:23 +0000 (17:57 -0700)]
Update copyright year in SEI header

Fiona Glaser [Fri, 26 Mar 2010 22:33:20 +0000 (15:33 -0700)]
New "superfast" preset, much faster intra analysis

Especially at the fastest settings, intra analysis was taking up the majority of MB analysis time.
This patch takes a ton more shortcuts at the fastest encoding settings, decreasing compression 0.5-5% but improving speed greatly.
Also rearrange the fastest presets a bit: now we have ultrafast, superfast, veryfast, faster.
superfast is the old veryfast (but much faster due to this patch).
veryfast is between the old veryfast and faster.
faster is the same as before except with MB-tree on.

Encoding with subme >= 5 should be unaffected by this patch.

Fiona Glaser [Thu, 25 Mar 2010 21:46:24 +0000 (14:46 -0700)]
Avoid redundant MV prediction in duplicate refs

Henrik Gramner [Wed, 24 Mar 2010 22:27:30 +0000 (23:27 +0100)]
Cosmetics in mvd handling
Use a 2D array instead of doing manual pointer arithmetic.
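
The difference between the two styles, shown on a made-up layout (`N_BLOCKS` and both accessor names are illustrative, not x264's): the flat version computes the offset by hand, the 2D version lets the compiler do it and makes the two components explicit in the type.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCKS 16 /* illustrative size only */

/* Before: flat storage, index arithmetic done by hand. */
static int16_t mvd_get_flat( const int16_t *mvd, int blk, int comp )
{
    assert( blk >= 0 && blk < N_BLOCKS && ( comp == 0 || comp == 1 ) );
    return mvd[blk * 2 + comp];
}

/* After: each element is a pair, so the x/y component is part of the type. */
static int16_t mvd_get_2d( int16_t (*mvd)[2], int blk, int comp )
{
    assert( blk >= 0 && blk < N_BLOCKS && ( comp == 0 || comp == 1 ) );
    return mvd[blk][comp];
}
```

Both accessors address the same memory layout, so a change like this is purely cosmetic, matching the commit title.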

Fiona Glaser [Wed, 24 Mar 2010 14:25:01 +0000 (07:25 -0700)]
Fix make uninstall on systems with executable suffixes

Fiona Glaser [Tue, 23 Mar 2010 21:00:58 +0000 (14:00 -0700)]
Add tune for still image compression
There has been some demand for this from companies looking to use x264 for still image compression (it can outperform JPEG or JPEG-2000 by a factor of 2 or more).
Still image compression is a bit different; because temporal stability isn't an issue, we can get away with far more powerful psy settings.

Henrik Gramner [Mon, 22 Mar 2010 01:59:50 +0000 (02:59 +0100)]
Pad non-mod16 resolutions using the correct field

Improves compression of interlaced videos with non-mod16 heights.
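
One plausible reading of "the correct field", sketched under assumptions: when padding an interlaced frame below its last real row, each padding row should replicate the last real row of the same parity, so top-field padding comes from the top field and bottom-field padding from the bottom field. The helper names are hypothetical and x264's actual padding code may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Row to replicate when filling padding row y (y >= height). */
static int pad_source_row( int y, int height, int interlaced )
{
    assert( height >= 2 && y >= height );
    if( !interlaced )
        return height - 1;
    /* last real row whose parity (field) matches that of y */
    return height - 1 - ( ( y - ( height - 1 ) ) & 1 );
}

/* Extend a plane from height to padded_height rows. */
static void pad_plane_bottom( uint8_t *plane, int stride, int width,
                              int height, int padded_height, int interlaced )
{
    for( int y = height; y < padded_height; y++ )
        memcpy( plane + y * stride,
                plane + pad_source_row( y, height, interlaced ) * stride,
                (size_t)width );
}
```

Replicating the wrong field would put content from the other field into the padding and make the padded macroblock rows harder to predict, which would explain the compression gain.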

Fiona Glaser [Sun, 21 Mar 2010 16:10:00 +0000 (09:10 -0700)]
Document slow/fast firstpass in --fullhelp

Holger Lubitz [Sat, 20 Mar 2010 19:41:21 +0000 (20:41 +0100)]
Fix some misattributions in profiling
Cycles spent in load_hadamard and the avg2 w16 ssse3 cacheline split code were misattributed.

Fiona Glaser [Sun, 21 Mar 2010 00:07:12 +0000 (17:07 -0700)]
Much faster non-RD intra analysis
Since every pred mode costs at least 1 bit, move that part into the initial SATD cost.
This lets i4x4/i8x8 analysis terminate earlier.
If the cost of the predicted mode is less than the cost of signalling any other mode, early-terminate the analysis.
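
A simplified sketch of that early termination, assuming the usual H.264 intra 4x4 signalling cost of 1 bit for the predicted mode and 4 bits for any other mode. The common 1 bit is left out of every comparison since it cannot change the ranking, so a non-predicted mode pays 3 extra bits. The function name, the precomputed `satd` array, and the `tested` counter are illustrative assumptions; a real encoder would compute each SATD on demand, which is where the time saving comes from.

```c
#include <assert.h>

#define N_MODES 9     /* intra 4x4 prediction modes */
#define EXTRA_BITS 3  /* bits a non-predicted mode costs beyond the common 1 */

/* Pick the cheapest mode by SATD plus signalling cost.
 * satd[m]:   distortion of mode m
 * lambda:    cost of one bit
 * pred_mode: mode predicted from neighbouring blocks
 * *tested:   set to the number of modes actually evaluated
 * Returns the chosen mode. */
static int pick_intra_mode( const int satd[N_MODES], int lambda,
                            int pred_mode, int *tested )
{
    assert( pred_mode >= 0 && pred_mode < N_MODES && lambda >= 0 );
    int best_mode = pred_mode;
    int best_cost = satd[pred_mode];
    *tested = 1;

    /* Any other mode costs at least EXTRA_BITS * lambda even with zero
     * distortion, so it cannot beat the predicted mode in this case. */
    if( best_cost <= EXTRA_BITS * lambda )
        return best_mode;

    for( int m = 0; m < N_MODES; m++ )
    {
        if( m == pred_mode )
            continue;
        int cost = satd[m] + EXTRA_BITS * lambda;
        ( *tested )++;
        if( cost < best_cost )
        {
            best_cost = cost;
            best_mode = m;
        }
    }
    return best_mode;
}
```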

Fiona Glaser [Wed, 17 Mar 2010 22:53:43 +0000 (15:53 -0700)]
Fix stack alignment in sliced threads
Could cause crashes when called from non-GCC-compiled applications.

Henrik Gramner [Tue, 16 Mar 2010 00:46:00 +0000 (01:46 +0100)]
Cosmetics: use sizeof() where appropriate

Fiona Glaser [Mon, 15 Mar 2010 07:01:57 +0000 (00:01 -0700)]
Split up analyse_init
Save some time by avoiding some unnecessary inits and moving other parts to per-thread init.

Henrik Gramner [Mon, 15 Mar 2010 00:19:45 +0000 (01:19 +0100)]
Reduce stack usage of b-adapt 2's trellis
Also remove some redundant code.

Fiona Glaser [Sun, 14 Mar 2010 08:25:02 +0000 (00:25 -0800)]
Various motion estimation optimizations
Faster method of checking MV range.
Predict MVs and cache MVs/MVDs for bidir qpel-RD.
A whole bunch of other minor optimizations.
Slightly better performance and compression.