X-Git-Url: https://git.sesse.net/?a=blobdiff_plain;f=doc%2Foptimization.txt;h=a14116483d50226f6996be67c13d192b25c188d4;hb=1a592ecc476ec58be430636f1ad07a7dd86089e1;hp=f6033402bb2261f2ce2baba9f814131b71d21429;hpb=5e123bd359226afe7123540e6438a72306c75a2e;p=ffmpeg

diff --git a/doc/optimization.txt b/doc/optimization.txt
index f6033402bb2..a14116483d5 100644
--- a/doc/optimization.txt
+++ b/doc/optimization.txt
@@ -16,7 +16,7 @@ Understanding these overoptimized functions:
 --------------------------------------------
 As many functions tend to be a bit difficult to understand because
 of optimizations, it can be hard to optimize them further, or write
-architecture-specific versions. It is recommened to look at older
+architecture-specific versions. It is recommended to look at older
 revisions of the interesting files (for a web frontend try ViewVC at
 http://svn.mplayerhq.hu/ffmpeg/trunk/).
 Alternatively, look into the other architecture-specific versions in
@@ -28,22 +28,23 @@ NOTE: If you still don't understand some function, ask at our mailing list!!!
 (http://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel)
 
 
-What speedup justifies an optimizetion?
----------------------------------------
-Normaly with clean&simple optimizations and widely used codecs a overall
-speedup of the affected codec of 0.1% is enough. These speedups accumulate
-and can make a big difference after a while ...
-Also if none of the following gets worse and at least one gets better then an
-optimization is always a good idea even if the overall gain is less than 0.1%
-(speed, binary code size, source size, source readability)
-For obscure codecs noone uses, the goal is more toward keeping the code clean
-small and readable than to make it 1% faster.
+When is an optimization justified?
+----------------------------------
+Normally, clean and simple optimizations for widely used codecs are
+justified even if they only achieve an overall speedup of 0.1%. These
+speedups accumulate and can make a big difference after awhile. Also, if
+none of the following factors get worse due to an optimization -- speed,
+binary code size, source size, source readability -- and at least one
+factor improves, then an optimization is always a good idea even if the
+overall gain is less than 0.1%. For obscure codecs that are not often
+used, the goal is more toward keeping the code clean, small, and
+readable instead of making it 1% faster.
 
 
 WTF is that function good for ....:
 -----------------------------------
-The primary purpose of that list is to avoid wasting time to optimize functions
-which are rarely used
+The primary purpose of this list is to avoid wasting time optimizing functions
+which are rarely used.
 
 put(_no_rnd)_pixels{,_x2,_y2,_xy2}
     Used in motion compensation (en/decoding).
@@ -150,6 +151,22 @@ The minimum guaranteed alignment is written in the .h files, for example:
     void (*put_pixels_clamped)(const DCTELEM *block/*align 16*/, UINT8 *pixels/*align 8*/, int line_size);
 
 
+General Tips:
+-------------
+Use asm loops like:
+asm(
+    "1: ....
+    ...
+    "jump_instruciton ....
+Do not use C loops:
+do{
+    asm(
+        ...
+}while()
+
+Use asm() instead of intrinsics. The latter requires a good optimizing compiler
+which gcc is not.
+
 Links:
 ======
 
@@ -183,6 +200,8 @@ Optimization guide for ARM11 (used in Nokia N800 Internet Tablet):
 http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211j/DDI0211J_arm1136_r1p5_trm.pdf
 Optimization guide for Intel XScale (used in Sharp Zaurus PDA):
 http://download.intel.com/design/intelxscale/27347302.pdf
+Intel Wireless MMX2 Coprocessor: Programmers Reference Manual
+http://download.intel.com/design/intelxscale/31451001.pdf
 
 PowerPC-specific:
 -----------------