From 84157cf59afcbf8aa2b43bd2807ec5696584bff3 Mon Sep 17 00:00:00 2001
From: "sgunderson@bigfoot.com" <>
Date: Sat, 31 May 2008 18:47:19 -0700
Subject: [PATCH] Added a README.

---
 README | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 README

diff --git a/README b/README
new file mode 100644
index 0000000..bcccb86
--- /dev/null
+++ b/README
@@ -0,0 +1,85 @@
+Short README, perhaps not too useful.
+
+qscale (the "q" is for "quick" -- the confusion with "quantization scale" is
+unfortunate, but there are only so many short names in the world) is a fast
+JPEG-to-JPEG-only up- and downscaler. On my 1.2GHz Core Duo laptop (using
+only one core), qscale is 3-4 times as fast as ImageMagick for downscaling
+large JPEGs (~10Mpix from my digital camera) to more moderate-sized JPEGs
+for web use etc. (like 640x480) without sacrificing quality. (Benchmarking must
+be done with a bit of care, though, in particular due to different subsampling
+options possible etc.) Most of the time in qscale's case is spent reading in
+the original image using libjpeg, which is shared between the two. However, it
+would probably not be realistic to exclude the libjpeg time, as most (although
+not all) real-world scaling tasks would indeed need to read and decode the
+source JPEG.
+
+Note: This is not to denounce ImageMagick in any way. It is a fine library,
+capable of doing much more than qscale can ever hope to do. Comparisons between
+the two are mainly to provide a well-known reference, and to demonstrate that
+more specific tools can usually be made faster than generic tools.
+
+qscale is not novel in any way, nor is it perfect (far from it; it's more of a
+proof of concept than anything else) -- it is mainly a piece of engineering.
+However, the following techniques deserve some kind of mention:
+
+ - qscale recognizes that JPEGs are usually stored in the YCbCr colorspace and
+   not RGB. (ImageMagick can, too, if you give it the right flags, but not all
+   its operations are well-defined in YCbCr.) Although conversion between the
+   two is cheap, it is not free, and it is not needed for scaling. Thus, qscale
+   does not do it.
+ - qscale recognizes that JPEGs are stored with the color channels mostly
+   separate (planar) and not chunked. Scaling does not need to be done on
+   chunked data -- in fact, mostly, scaling is easier to do on planar data.
+   Thus, no conversion to chunked before scaling (and no need to convert back
+   to planar afterwards). (Note: Some SIMD properties might be easier to
+   exploit on a chunked representation. It's usually not worth it in total,
+   though.)
+ - qscale can utilize the SSE instruction set found in almost all modern
+   x86-compatible processors to do more work in the same number of instructions.
+   (It can also use the SSE3 instruction set, although the extra boost on top
+   of SSE is smaller. In time, it will utilize the SSE extensions known as
+   SSE4.1, which can help even more.) It takes care to align the image data and
+   memory accesses on the right boundaries wherever it makes sense.
+ - qscale recognizes (like almost any decent scaling program) that most
+   practical filter kernels are separable, so scaling can be done in two
+   sequential simpler passes (horizontal and vertical) instead of one. The
+   order does matter, though -- I've found doing the vertical pass (in
+   cache-friendly order, doing multiple neighboring pixels at a time to
+   exploit that the processor reads in entire cache lines and not individual
+   bytes at a time) before the horizontal to be superior, in particular
+   because the vertical pass is easier to vectorize with SIMD.
+ - qscale understands that JPEGs are typically subsampled; i.e., that the
+   different color components are not stored at the same resolution. On
+   the web, this is typically because the eye is less sensitive to color
+   (chroma) information and as such much of it can safely be stored in
+   a lower resolution to reduce file size without much visible quality
+   degradation; in the JPEGs stored by a digital camera, it is simply
+   because much of the color information is interpolated anyway (since
+   the individual CCD dots are sensitive to either red, green or blue,
+   not all at the same time), so it would not make much sense to pretend
+   there is full color information. qscale does not ask libjpeg to
+   interpolate the "missing" color information nor to downscale the
+   already-downscaled color channels as ImageMagick does, but instead
+   does a single scaling pass from the original resolution to the final
+   subsampled resolution. (This is impossible for any program working
+   in RGB mode, or chunked YCbCr.) This increases both speed and quality,
+   although the effect on the latter is not particularly large.
+
+The following optimizations are possible but not done (yet?):
+
+ - qscale does not do the IDCT itself, even though there is improvement
+   potential over libjpeg's IDCT. (There is an unmaintained and little-used fork
+   of libjpeg called libjpeg-mmx that demonstrates this quite well.) In fact,
+   since the DCT can be viewed as just another (separable, but not
+   time-invariant) FIR filter, the quantization scaling and IDCT could probably
+   be folded into the scaling in many cases, in particular those where the
+   filter kernel is large (i.e., large amounts of scaling).
+ - qscale does not use multiple processors or cores (although different cores
+   can of course work on different images at the same time).
+ - qscale does not make very good use of the extra eight SSE registers found
+   on 64-bit x86-compatible (usually called amd64 or x86-64) machines. In
+   fact, out of the box it might not even compile on such machines.
+
+qscale is Copyright 2008 Steinar H. Gunderson, and
+licensed under the GNU General Public License, version 2. The full text of
+the GPLv2 can be found in the included COPYING file.
-- 
2.39.2
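
Illustration (not part of the patch): the README's point about separable
kernels and planar data can be sketched roughly as below. This is a minimal
sketch, not qscale's code. It assumes a single planar channel held as a list
of rows, an integer downscale factor, and a plain box (averaging) kernel
picked only for brevity; the README does not say which kernel qscale uses.
All function names are hypothetical.

```python
# Illustrative sketch only; nothing here is taken from qscale's source.
# Assumptions: one planar channel as a list of rows, an integer downscale
# factor, and a box (averaging) kernel chosen for simplicity.

def vertical_pass(plane, factor):
    """Average each group of `factor` consecutive rows into one output row."""
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height - height % factor, factor):
        acc = [0.0] * width
        # Each source row is walked left to right, so reads are sequential
        # and many neighboring pixels are handled per row.
        for row in plane[y:y + factor]:
            for x in range(width):
                acc[x] += row[x]
        out.append([v / factor for v in acc])
    return out


def horizontal_pass(plane, factor):
    """Average each group of `factor` consecutive pixels within every row."""
    out = []
    for row in plane:
        width = len(row) - len(row) % factor
        out.append([sum(row[x:x + factor]) / factor
                    for x in range(0, width, factor)])
    return out


def downscale_plane(plane, factor):
    """Separable downscale: vertical pass first, then horizontal."""
    return horizontal_pass(vertical_pass(plane, factor), factor)


def downscale_direct(plane, factor):
    """Reference: one non-separable pass over factor-by-factor blocks."""
    height = len(plane) - len(plane) % factor
    width = len(plane[0]) - len(plane[0]) % factor
    out = []
    for y in range(0, height, factor):
        out_row = []
        for x in range(0, width, factor):
            block = [plane[yy][xx]
                     for yy in range(y, y + factor)
                     for xx in range(x, x + factor)]
            out_row.append(sum(block) / (factor * factor))
        out.append(out_row)
    return out


if __name__ == "__main__":
    luma = [[0, 1, 2, 3],
            [4, 5, 6, 7],
            [8, 9, 10, 11],
            [12, 13, 14, 15]]
    print(downscale_plane(luma, 2))  # prints [[2.5, 4.5], [10.5, 12.5]]
```

For a box kernel the two-pass result matches the direct block average. Each
of Y, Cb and Cr would be passed through downscale_plane on its own, with no
conversion to RGB or to a chunked layout.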