1 /* stb_image - v2.26 - public domain image loader - http://nothings.org/stb
2 no warranty implied; use at your own risk
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
8 // i.e. it should look like this:
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8/16-bit-per-channel
26 TGA (not sure what subset, if a subset)
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
33 PNM (PPM and PGM binary only)
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
42 Full documentation under "DOCUMENTATION" below.
47 See end of file for license information.
49 RECENT REVISION HISTORY:
51 2.26 (2020-07-13) many minor fixes
52 2.25 (2020-02-02) fix warnings
53 2.24 (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically
54 2.23 (2019-08-11) fix clang static analysis warning
55 2.22 (2019-03-04) gif fixes, fix warnings
56 2.21 (2019-02-25) fix typo in comment
57 2.20 (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
58 2.19 (2018-02-11) fix warning
59 2.18 (2018-01-30) fix warnings
60 2.17 (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
61 2.16 (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
62 2.15 (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
63 2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
64 2.13 (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
65 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
66 2.11 (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
67 RGB-format JPEG; remove white matting in PSD;
68 allocate large structures on the stack;
69 correct channel count for PNG & BMP
70 2.10 (2016-01-22) avoid warning introduced in 2.09
71 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
73 See end of file for full revision history.
76 ============================ Contributors =========================
78 Image formats Extensions, features
79 Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info)
80 Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info)
81 Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG)
82 Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks)
83 Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG)
84 Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip)
85 Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD)
86 github:urraka (animated gif) Junggon Kim (PNM comments)
87 Christopher Forseth (animated gif) Daniel Gibson (16-bit TGA)
88 socks-the-fox (16-bit PNG)
89 Jeremy Sawicki (handle all ImageNet JPGs)
90 Optimizations & bugfixes Mikhail Morozov (1-bit BMP)
91 Fabian "ryg" Giesen Anael Seghezzi (is-16-bit query)
97 Marc LeBlanc David Woo Guillaume George Martins Mozeiko
98 Christpher Lloyd Jerry Jansson Joseph Thomson Blazej Dariusz Roszkowski
99 Phil Jordan Dave Moore Roy Eltham
100 Hayaki Saito Nathan Reed Won Chun
101 Luke Graham Johan Duparc Nick Verigakis the Horde3D community
102 Thomas Ruf Ronny Chevalier github:rlyeh
103 Janez Zemva John Bartholomew Michal Cichon github:romigrou
104 Jonathan Blow Ken Hamada Tero Hanninen github:svdijk
105 Laurent Gomila Cort Stratton github:snagar
106 Aruelien Pocheville Sergio Gonzalez Thibault Reuille github:Zelex
107 Cass Everitt Ryamond Barbiero github:grim210
108 Paul Du Bois Engin Manap Aldo Culquicondor github:sammyhw
109 Philipp Wiesemann Dale Weiler Oriol Ferrer Mesia github:phprus
110 Josh Tobin Matthew Gregan github:poppolopoppo
111 Julian Raschke Gregory Mullen Christian Floisand github:darealshinji
112 Baldur Karlsson Kevin Schmidt JR Smith github:Michaelangel007
113 Brad Weinberger Matvey Cherevko [reserved]
114 Luca Sas Alexander Veselov Zack Middleton [reserved]
115 Ryan C. Gordon [reserved] [reserved]
116 DO NOT ADD YOUR NAME HERE
118 To add your name to the credits, pick a random blank space in the middle and fill it.
119 80% of merge conflicts on stb PRs are due to people adding their name at the end
123 #ifndef STBI_INCLUDE_STB_IMAGE_H
124 #define STBI_INCLUDE_STB_IMAGE_H
129 // - no 12-bit-per-channel JPEG
130 // - no JPEGs with arithmetic coding
131 // - GIF always returns *comp=4
133 // Basic usage (see HDR discussion below for HDR usage):
135 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
136 // // ... process data if not NULL ...
137 // // ... x = width, y = height, n = # 8-bit components per pixel ...
138 // // ... replace '0' with '1'..'4' to force that many components per pixel
139 // // ... but 'n' will always be the number that it would have been if you said 0
140 // stbi_image_free(data)
142 // Standard parameters:
143 // int *x -- outputs image width in pixels
144 // int *y -- outputs image height in pixels
145 // int *channels_in_file -- outputs # of image components in image file
146 // int desired_channels -- if non-zero, # of image components requested in result
148 // The return value from an image loader is an 'unsigned char *' which points
149 // to the pixel data, or NULL on an allocation failure or if the image is
150 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
151 // with each pixel consisting of N interleaved 8-bit components; the first
152 // pixel pointed to is top-left-most in the image. There is no padding between
153 // image scanlines or between pixels, regardless of format. The number of
154 // components N is 'desired_channels' if desired_channels is non-zero, or
155 // *channels_in_file otherwise. If desired_channels is non-zero,
156 // *channels_in_file has the number of components that _would_ have been
157 // output otherwise. E.g. if you set desired_channels to 4, you will always
158 // get RGBA output, but you can check *channels_in_file to see if it's trivially
159 // opaque because e.g. there were only 3 channels in the source image.
161 // An output image with N components has the following components interleaved
162 // in this order in each pixel:
164 // N=#comp components
167 // 3 red, green, blue
168 // 4 red, green, blue, alpha
170 // If image loading fails for any reason, the return value will be NULL,
171 // and *x, *y, *channels_in_file will be unchanged. The function
172 // stbi_failure_reason() can be queried for an extremely brief, end-user
173 // unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
174 // to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
175 // more user-friendly ones.
177 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
179 // ===========================================================================
183 // If compiling for Windows and you wish to use Unicode filenames, compile
185 // #define STBI_WINDOWS_UTF8
186 // and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert
187 // Windows wchar_t filenames to utf8.
189 // ===========================================================================
193 // stb libraries are designed with the following priorities:
196 // 2. easy to maintain
197 // 3. good performance
199 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
200 // and for best performance I may provide less-easy-to-use APIs that give higher
201 // performance, in addition to the easy-to-use ones. Nevertheless, it's important
202 // to keep in mind that from the standpoint of you, a client of this library,
203 // all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
205 // Some secondary priorities arise directly from the first two, some of which
206 // provide more explicit reasons why performance can't be emphasized.
208 // - Portable ("ease of use")
209 // - Small source code footprint ("easy to maintain")
210 // - No dependencies ("ease of use")
212 // ===========================================================================
216 // I/O callbacks allow you to read from arbitrary sources, like packaged
217 // files or some other source. Data read from callbacks are processed
218 // through a small internal buffer (currently 128 bytes) to try to reduce
221 // The three functions you must define are "read" (reads some bytes of data),
222 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
224 // ===========================================================================
228 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
229 // supported by the compiler. For ARM Neon support, you must explicitly
232 // (The old do-it-yourself SIMD API is no longer supported in the current
235 // On x86, SSE2 will automatically be used when available based on a run-time
236 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
237 // the typical path is to have separate builds for NEON and non-NEON devices
238 // (at least this is true for iOS and Android). Therefore, the NEON support is
239 // toggled by a build flag: define STBI_NEON to get NEON loops.
241 // If for some reason you do not want to use any of SIMD code, or if
242 // you have issues compiling it, you can disable it entirely by
243 // defining STBI_NO_SIMD.
245 // ===========================================================================
247 // HDR image support (disable by defining STBI_NO_HDR)
249 // stb_image supports loading HDR images in general, and currently the Radiance
250 // .HDR file format specifically. You can still load any file through the existing
251 // interface; if you attempt to load an HDR file, it will be automatically remapped
252 // to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
253 // both of these constants can be reconfigured through this interface:
255 // stbi_hdr_to_ldr_gamma(2.2f);
256 // stbi_hdr_to_ldr_scale(1.0f);
258 // (note, do not use _inverse_ constants; stbi_image will invert them
261 // Additionally, there is a new, parallel interface for loading files as
262 // (linear) floats to preserve the full dynamic range:
264 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
266 // If you load LDR images through this interface, those images will
267 // be promoted to floating point values, run through the inverse of
268 // constants corresponding to the above:
270 // stbi_ldr_to_hdr_scale(1.0f);
271 // stbi_ldr_to_hdr_gamma(2.2f);
273 // Finally, given a filename (or an open file or memory block--see header
274 // file for details) containing image data, you can query for the "most
275 // appropriate" interface to use (that is, whether the image is HDR or
278 // stbi_is_hdr(char *filename);
280 // ===========================================================================
282 // iPhone PNG support:
284 // By default we convert iphone-formatted PNGs back to RGB, even though
285 // they are internally encoded differently. You can disable this conversion
286 // by calling stbi_convert_iphone_png_to_rgb(0), in which case
287 // you will always just get the native iphone "format" through (which
288 // is BGR stored in RGB).
290 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
291 // pixel to remove any premultiplied alpha *only* if the image file explicitly
292 // says there's premultiplied data (currently only happens in iPhone images,
293 // and only if iPhone convert-to-rgb processing is on).
295 // ===========================================================================
297 // ADDITIONAL CONFIGURATION
299 // - You can suppress implementation of any of the decoders to reduce
300 // your code footprint by #defining one or more of the following
301 // symbols before creating the implementation.
311 // STBI_NO_PNM (.ppm and .pgm)
313 // - You can request *only* certain decoders and suppress all other ones
314 // (this will be more forward-compatible, as addition of new decoders
315 // doesn't require you to disable them explicitly):
325 // STBI_ONLY_PNM (.ppm and .pgm)
327 // - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
328 // want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
330 // - If you define STBI_MAX_DIMENSIONS, stb_image will reject images greater
331 // than that size (in either width or height) without further processing.
332 // This is to let programs in the wild set an upper bound to prevent
333 // denial-of-service attacks on untrusted data, as one could generate a
334 // valid image of gigantic dimensions and force stb_image to allocate a
335 // huge block of memory and spend disproportionate time decoding it. By
336 // default this is set to (1 << 24), which is 16777216, but that's still
339 #ifndef STBI_NO_STDIO
341 #endif // STBI_NO_STDIO
343 #define STBI_VERSION 1
347 STBI_default = 0, // only used for desired_channels
356 typedef unsigned char stbi_uc;
357 typedef unsigned short stbi_us;
364 #ifdef STB_IMAGE_STATIC
365 #define STBIDEF static
367 #define STBIDEF extern
371 //////////////////////////////////////////////////////////////////////////////
373 // PRIMARY API - works on images of any type
377 // load image by filename, open file, or memory buffer
382 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
383 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
384 int (*eof) (void *user); // returns nonzero if we are at end of file/data
387 ////////////////////////////////////
389 // 8-bits-per-channel interface
392 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *channels_in_file, int desired_channels);
393 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *channels_in_file, int desired_channels);
395 #ifndef STBI_NO_STDIO
396 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
397 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
398 // for stbi_load_from_file, file pointer is left pointing immediately after image
402 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
405 #ifdef STBI_WINDOWS_UTF8
406 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);
409 ////////////////////////////////////
411 // 16-bits-per-channel interface
414 STBIDEF stbi_us *stbi_load_16_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
415 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);
417 #ifndef STBI_NO_STDIO
418 STBIDEF stbi_us *stbi_load_16 (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
419 STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
422 ////////////////////////////////////
424 // float-per-channel interface
426 #ifndef STBI_NO_LINEAR
427 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
428 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);
430 #ifndef STBI_NO_STDIO
431 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
432 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
437 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
438 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
439 #endif // STBI_NO_HDR
441 #ifndef STBI_NO_LINEAR
442 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
443 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
444 #endif // STBI_NO_LINEAR
446 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
447 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
448 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
449 #ifndef STBI_NO_STDIO
450 STBIDEF int stbi_is_hdr (char const *filename);
451 STBIDEF int stbi_is_hdr_from_file(FILE *f);
452 #endif // STBI_NO_STDIO
455 // get a VERY brief reason for failure
456 // on most compilers (and ALL modern mainstream compilers) this is threadsafe
457 STBIDEF const char *stbi_failure_reason (void);
459 // free the loaded image -- this is just free()
460 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
462 // get image dimensions & components without fully decoding
463 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
464 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
465 STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len);
466 STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user);
468 #ifndef STBI_NO_STDIO
469 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
470 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
471 STBIDEF int stbi_is_16_bit (char const *filename);
472 STBIDEF int stbi_is_16_bit_from_file(FILE *f);
477 // for image formats that explicitly notate that they have premultiplied alpha,
478 // we just return the colors as stored in the file. set this flag to force
479 // unpremultiplication. results are undefined if the unpremultiply overflow.
480 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
482 // indicate whether we should process iphone images back to canonical format,
483 // or just pass them through "as-is"
484 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
486 // flip the image vertically, so the first pixel in the output array is the bottom left
487 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
489 // as above, but only applies to images loaded on the thread that calls the function
490 // this function is only available if your compiler supports thread-local variables;
491 // calling it will fail to link if your compiler doesn't
492 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip);
494 // ZLIB client - used by PNG, available for other purposes
496 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
497 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
498 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
499 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
501 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
502 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
511 //// end header file /////////////////////////////////////////////////////
512 #endif // STBI_INCLUDE_STB_IMAGE_H
514 #ifdef STB_IMAGE_IMPLEMENTATION
516 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
517 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
518 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
519 || defined(STBI_ONLY_ZLIB)
520 #ifndef STBI_ONLY_JPEG
523 #ifndef STBI_ONLY_PNG
526 #ifndef STBI_ONLY_BMP
529 #ifndef STBI_ONLY_PSD
532 #ifndef STBI_ONLY_TGA
535 #ifndef STBI_ONLY_GIF
538 #ifndef STBI_ONLY_HDR
541 #ifndef STBI_ONLY_PIC
544 #ifndef STBI_ONLY_PNM
549 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
555 #include <stddef.h> // ptrdiff_t on osx
560 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
561 #include <math.h> // ldexp, pow
564 #ifndef STBI_NO_STDIO
570 #define STBI_ASSERT(x) assert(x)
574 #define STBI_EXTERN extern "C"
576 #define STBI_EXTERN extern
582 #define stbi_inline inline
587 #define stbi_inline __forceinline
590 #ifndef STBI_NO_THREAD_LOCALS
591 #if defined(__cplusplus) && __cplusplus >= 201103L
592 #define STBI_THREAD_LOCAL thread_local
593 #elif defined(__GNUC__) && __GNUC__ < 5
594 #define STBI_THREAD_LOCAL __thread
595 #elif defined(_MSC_VER)
596 #define STBI_THREAD_LOCAL __declspec(thread)
597 #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__)
598 #define STBI_THREAD_LOCAL _Thread_local
601 #ifndef STBI_THREAD_LOCAL
602 #if defined(__GNUC__)
603 #define STBI_THREAD_LOCAL __thread
609 typedef unsigned short stbi__uint16;
610 typedef signed short stbi__int16;
611 typedef unsigned int stbi__uint32;
612 typedef signed int stbi__int32;
615 typedef uint16_t stbi__uint16;
616 typedef int16_t stbi__int16;
617 typedef uint32_t stbi__uint32;
618 typedef int32_t stbi__int32;
621 // should produce compiler error if size is wrong
622 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
625 #define STBI_NOTUSED(v) (void)(v)
627 #define STBI_NOTUSED(v) (void)sizeof(v)
631 #define STBI_HAS_LROTL
634 #ifdef STBI_HAS_LROTL
635 #define stbi_lrot(x,y) _lrotl(x,y)
637 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
640 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
642 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
645 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
649 #define STBI_MALLOC(sz) malloc(sz)
650 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
651 #define STBI_FREE(p) free(p)
654 #ifndef STBI_REALLOC_SIZED
655 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
659 #if defined(__x86_64__) || defined(_M_X64)
660 #define STBI__X64_TARGET
661 #elif defined(__i386) || defined(_M_IX86)
662 #define STBI__X86_TARGET
665 #if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
666 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
667 // which in turn means it gets to use SSE2 everywhere. This is unfortunate,
668 // but previous attempts to provide the SSE2 functions with runtime
669 // detection caused numerous issues. The way architecture extensions are
670 // exposed in GCC/Clang is, sadly, not really suited for one-file libs.
671 // New behavior: if compiled with -msse2, we use SSE2 without any
672 // detection; if not, we don't use it at all.
676 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
677 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
679 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
680 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
681 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
682 // simultaneously enabling "-mstackrealign".
684 // See https://github.com/nothings/stb/issues/81 for more information.
686 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
687 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
691 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
693 #include <emmintrin.h>
697 #if _MSC_VER >= 1400 // not VC6
698 #include <intrin.h> // __cpuid
699 static int stbi__cpuid3(void)
706 static int stbi__cpuid3(void)
718 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
720 #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
721 static int stbi__sse2_available(void)
723 int info3 = stbi__cpuid3();
724 return ((info3 >> 26) & 1) != 0;
728 #else // assume GCC-style if not VC++
729 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
731 #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
732 static int stbi__sse2_available(void)
734 // If we're even attempting to compile this on GCC/Clang, that means
735 // -msse2 is on, which means the compiler is allowed to use SSE2
736 // instructions at will, and so are we.
745 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
750 #include <arm_neon.h>
751 // assume GCC or Clang on ARM targets
752 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
755 #ifndef STBI_SIMD_ALIGN
756 #define STBI_SIMD_ALIGN(type, name) type name
759 #ifndef STBI_MAX_DIMENSIONS
760 #define STBI_MAX_DIMENSIONS (1 << 24)
763 ///////////////////////////////////////////////
765 // stbi__context struct and start_xxx functions
767 // stbi__context structure is our basic context used by all images, so it
768 // contains all the IO context, plus some basic image information
771 stbi__uint32 img_x, img_y;
772 int img_n, img_out_n;
774 stbi_io_callbacks io;
777 int read_from_callbacks;
779 stbi_uc buffer_start[128];
780 int callback_already_read;
782 stbi_uc *img_buffer, *img_buffer_end;
783 stbi_uc *img_buffer_original, *img_buffer_original_end;
787 static void stbi__refill_buffer(stbi__context *s);
789 // initialize a memory-decode context
790 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
793 s->read_from_callbacks = 0;
794 s->callback_already_read = 0;
795 s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
796 s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
799 // initialize a callback-based context
800 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
803 s->io_user_data = user;
804 s->buflen = sizeof(s->buffer_start);
805 s->read_from_callbacks = 1;
806 s->callback_already_read = 0;
807 s->img_buffer = s->img_buffer_original = s->buffer_start;
808 stbi__refill_buffer(s);
809 s->img_buffer_original_end = s->img_buffer_end;
812 #ifndef STBI_NO_STDIO
814 static int stbi__stdio_read(void *user, char *data, int size)
816 return (int) fread(data,1,size,(FILE*) user);
819 static void stbi__stdio_skip(void *user, int n)
822 fseek((FILE*) user, n, SEEK_CUR);
823 ch = fgetc((FILE*) user); /* have to read a byte to reset feof()'s flag */
825 ungetc(ch, (FILE *) user); /* push byte back onto stream if valid. */
829 static int stbi__stdio_eof(void *user)
831 return feof((FILE*) user) || ferror((FILE *) user);
834 static stbi_io_callbacks stbi__stdio_callbacks =
841 static void stbi__start_file(stbi__context *s, FILE *f)
843 stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
846 //static void stop_file(stbi__context *s) { }
848 #endif // !STBI_NO_STDIO
850 static void stbi__rewind(stbi__context *s)
852 // conceptually rewind SHOULD rewind to the beginning of the stream,
853 // but we just rewind to the beginning of the initial buffer, because
854 // we only use it after doing 'test', which only ever looks at at most 92 bytes
855 s->img_buffer = s->img_buffer_original;
856 s->img_buffer_end = s->img_buffer_original_end;
867 int bits_per_channel;
873 static int stbi__jpeg_test(stbi__context *s);
874 static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
875 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
879 static int stbi__png_test(stbi__context *s);
880 static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
881 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
882 static int stbi__png_is16(stbi__context *s);
886 static int stbi__bmp_test(stbi__context *s);
887 static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
888 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
892 static int stbi__tga_test(stbi__context *s);
893 static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
894 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
898 static int stbi__psd_test(stbi__context *s);
899 static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc);
900 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
901 static int stbi__psd_is16(stbi__context *s);
905 static int stbi__hdr_test(stbi__context *s);
906 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
907 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
911 static int stbi__pic_test(stbi__context *s);
912 static void *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
913 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
917 static int stbi__gif_test(stbi__context *s);
918 static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
919 static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
920 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
924 static int stbi__pnm_test(stbi__context *s);
925 static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
926 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
930 #ifdef STBI_THREAD_LOCAL
933 const char *stbi__g_failure_reason;
935 STBIDEF const char *stbi_failure_reason(void)
937 return stbi__g_failure_reason;
940 #ifndef STBI_NO_FAILURE_STRINGS
941 static int stbi__err(const char *str)
943 stbi__g_failure_reason = str;
948 static void *stbi__malloc(size_t size)
950 return STBI_MALLOC(size);
953 // stb_image uses ints pervasively, including for offset calculations.
954 // therefore the largest decoded image size we can support with the
955 // current code, even on 64-bit targets, is INT_MAX. this is not a
956 // significant limitation for the intended use case.
958 // we do, however, need to make sure our size calculations don't
959 // overflow. hence a few helper functions for size calculations that
960 // multiply integers together, making sure that they're non-negative
961 // and no overflow occurs.
963 // return 1 if the sum is valid, 0 on overflow.
964 // negative terms are considered invalid.
965 static int stbi__addsizes_valid(int a, int b)
968 // now 0 <= b <= INT_MAX, hence also
969 // 0 <= INT_MAX - b <= INTMAX.
970 // And "a + b <= INT_MAX" (which might overflow) is the
971 // same as a <= INT_MAX - b (no overflow)
972 return a <= INT_MAX - b;
975 // returns 1 if the product is valid, 0 on overflow.
976 // negative factors are considered invalid.
977 static int stbi__mul2sizes_valid(int a, int b)
979 if (a < 0 || b < 0) return 0;
980 if (b == 0) return 1; // mul-by-0 is always safe
981 // portable way to check for no overflows in a*b
982 return a <= INT_MAX/b;
985 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
986 // returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
987 static int stbi__mad2sizes_valid(int a, int b, int add)
989 return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add);
993 // returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
994 static int stbi__mad3sizes_valid(int a, int b, int c, int add)
996 return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
997 stbi__addsizes_valid(a*b*c, add);
1000 // returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
1001 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
1002 static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
1004 return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
1005 stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add);
1009 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
1010 // mallocs with size overflow checking
1011 static void *stbi__malloc_mad2(int a, int b, int add)
1013 if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
1014 return stbi__malloc(a*b + add);
1018 static void *stbi__malloc_mad3(int a, int b, int c, int add)
1020 if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
1021 return stbi__malloc(a*b*c + add);
1024 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
1025 static void *stbi__malloc_mad4(int a, int b, int c, int d, int add)
1027 if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
1028 return stbi__malloc(a*b*c*d + add);
1032 // stbi__err - error
1033 // stbi__errpf - error returning pointer to float
1034 // stbi__errpuc - error returning pointer to unsigned char
1036 #ifdef STBI_NO_FAILURE_STRINGS
1037 #define stbi__err(x,y) 0
1038 #elif defined(STBI_FAILURE_USERMSG)
1039 #define stbi__err(x,y) stbi__err(y)
1041 #define stbi__err(x,y) stbi__err(x)
1044 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
1045 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
1047 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
1049 STBI_FREE(retval_from_stbi_load);
1052 #ifndef STBI_NO_LINEAR
1053 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
1057 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
1060 static int stbi__vertically_flip_on_load_global = 0;
1062 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
1064 stbi__vertically_flip_on_load_global = flag_true_if_should_flip;
1067 #ifndef STBI_THREAD_LOCAL
1068 #define stbi__vertically_flip_on_load stbi__vertically_flip_on_load_global
1070 static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set;
1072 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip)
1074 stbi__vertically_flip_on_load_local = flag_true_if_should_flip;
1075 stbi__vertically_flip_on_load_set = 1;
1078 #define stbi__vertically_flip_on_load (stbi__vertically_flip_on_load_set \
1079 ? stbi__vertically_flip_on_load_local \
1080 : stbi__vertically_flip_on_load_global)
1081 #endif // STBI_THREAD_LOCAL
1083 static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
1085 memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
1086 ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
1087 ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
1088 ri->num_channels = 0;
1090 #ifndef STBI_NO_JPEG
1091 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri);
1094 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp, ri);
1097 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp, ri);
1100 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp, ri);
1103 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc);
1108 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp, ri);
1111 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp, ri);
1115 if (stbi__hdr_test(s)) {
1116 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri);
1117 return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
1122 // test tga last because it's a crappy test!
1123 if (stbi__tga_test(s))
1124 return stbi__tga_load(s,x,y,comp,req_comp, ri);
1127 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
1130 static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels)
1133 int img_len = w * h * channels;
1136 reduced = (stbi_uc *) stbi__malloc(img_len);
1137 if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
1139 for (i = 0; i < img_len; ++i)
1140 reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is sufficient approx of 16->8 bit scaling
1146 static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels)
1149 int img_len = w * h * channels;
1150 stbi__uint16 *enlarged;
1152 enlarged = (stbi__uint16 *) stbi__malloc(img_len*2);
1153 if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
1155 for (i = 0; i < img_len; ++i)
1156 enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
1162 static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel)
1165 size_t bytes_per_row = (size_t)w * bytes_per_pixel;
1167 stbi_uc *bytes = (stbi_uc *)image;
1169 for (row = 0; row < (h>>1); row++) {
1170 stbi_uc *row0 = bytes + row*bytes_per_row;
1171 stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row;
1172 // swap row0 with row1
1173 size_t bytes_left = bytes_per_row;
1174 while (bytes_left) {
1175 size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
1176 memcpy(temp, row0, bytes_copy);
1177 memcpy(row0, row1, bytes_copy);
1178 memcpy(row1, temp, bytes_copy);
1181 bytes_left -= bytes_copy;
1187 static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel)
1190 int slice_size = w * h * bytes_per_pixel;
1192 stbi_uc *bytes = (stbi_uc *)image;
1193 for (slice = 0; slice < z; ++slice) {
1194 stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
1195 bytes += slice_size;
1200 static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1202 stbi__result_info ri;
1203 void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
1208 // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
1209 STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
1211 if (ri.bits_per_channel != 8) {
1212 result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
1213 ri.bits_per_channel = 8;
1216 // @TODO: move stbi__convert_format to here
1218 if (stbi__vertically_flip_on_load) {
1219 int channels = req_comp ? req_comp : *comp;
1220 stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
1223 return (unsigned char *) result;
1226 static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1228 stbi__result_info ri;
1229 void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
1234 // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
1235 STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
1237 if (ri.bits_per_channel != 16) {
1238 result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
1239 ri.bits_per_channel = 16;
1242 // @TODO: move stbi__convert_format16 to here
1243 // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
1245 if (stbi__vertically_flip_on_load) {
1246 int channels = req_comp ? req_comp : *comp;
1247 stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
1250 return (stbi__uint16 *) result;
1253 #if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR)
1254 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1256 if (stbi__vertically_flip_on_load && result != NULL) {
1257 int channels = req_comp ? req_comp : *comp;
1258 stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
1263 #ifndef STBI_NO_STDIO
1265 #if defined(_MSC_VER) && defined(STBI_WINDOWS_UTF8)
1266 STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
1267 STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
1270 #if defined(_MSC_VER) && defined(STBI_WINDOWS_UTF8)
1271 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
1273 return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
1277 static FILE *stbi__fopen(char const *filename, char const *mode)
1280 #if defined(_MSC_VER) && defined(STBI_WINDOWS_UTF8)
1282 wchar_t wFilename[1024];
1283 if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)))
1286 if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)))
1289 #if _MSC_VER >= 1400
1290 if (0 != _wfopen_s(&f, wFilename, wMode))
1293 f = _wfopen(wFilename, wMode);
1296 #elif defined(_MSC_VER) && _MSC_VER >= 1400
1297 if (0 != fopen_s(&f, filename, mode))
1300 f = fopen(filename, mode);
1306 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1308 FILE *f = stbi__fopen(filename, "rb");
1309 unsigned char *result;
1310 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1311 result = stbi_load_from_file(f,x,y,comp,req_comp);
1316 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1318 unsigned char *result;
1320 stbi__start_file(&s,f);
1321 result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
1323 // need to 'unget' all the characters in the IO buffer
1324 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1329 STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp)
1331 stbi__uint16 *result;
1333 stbi__start_file(&s,f);
1334 result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp);
1336 // need to 'unget' all the characters in the IO buffer
1337 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1342 STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp)
1344 FILE *f = stbi__fopen(filename, "rb");
1345 stbi__uint16 *result;
1346 if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file");
1347 result = stbi_load_from_file_16(f,x,y,comp,req_comp);
1353 #endif //!STBI_NO_STDIO
1355 STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels)
1358 stbi__start_mem(&s,buffer,len);
1359 return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
1362 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels)
1365 stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user);
1366 return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
1369 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1372 stbi__start_mem(&s,buffer,len);
1373 return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
1376 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1379 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1380 return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
1384 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
1386 unsigned char *result;
1388 stbi__start_mem(&s,buffer,len);
1390 result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
1391 if (stbi__vertically_flip_on_load) {
1392 stbi__vertical_flip_slices( result, *x, *y, *z, *comp );
1399 #ifndef STBI_NO_LINEAR
1400 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1402 unsigned char *data;
1404 if (stbi__hdr_test(s)) {
1405 stbi__result_info ri;
1406 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri);
1408 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1412 data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
1414 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1415 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1418 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1421 stbi__start_mem(&s,buffer,len);
1422 return stbi__loadf_main(&s,x,y,comp,req_comp);
1425 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1428 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1429 return stbi__loadf_main(&s,x,y,comp,req_comp);
1432 #ifndef STBI_NO_STDIO
1433 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1436 FILE *f = stbi__fopen(filename, "rb");
1437 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1438 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1443 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1446 stbi__start_file(&s,f);
1447 return stbi__loadf_main(&s,x,y,comp,req_comp);
1449 #endif // !STBI_NO_STDIO
1451 #endif // !STBI_NO_LINEAR
1453 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1454 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1457 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1461 stbi__start_mem(&s,buffer,len);
1462 return stbi__hdr_test(&s);
1464 STBI_NOTUSED(buffer);
1470 #ifndef STBI_NO_STDIO
1471 STBIDEF int stbi_is_hdr (char const *filename)
1473 FILE *f = stbi__fopen(filename, "rb");
1476 result = stbi_is_hdr_from_file(f);
1482 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1485 long pos = ftell(f);
1488 stbi__start_file(&s,f);
1489 res = stbi__hdr_test(&s);
1490 fseek(f, pos, SEEK_SET);
1497 #endif // !STBI_NO_STDIO
1499 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1503 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1504 return stbi__hdr_test(&s);
1512 #ifndef STBI_NO_LINEAR
1513 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1515 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1516 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1519 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1521 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
1522 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1525 //////////////////////////////////////////////////////////////////////////////
1527 // Common code used by all image loaders
1537 static void stbi__refill_buffer(stbi__context *s)
1539 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1540 s->callback_already_read += (int) (s->img_buffer - s->img_buffer_original);
1542 // at end of file, treat same as if from memory, but need to handle case
1543 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1544 s->read_from_callbacks = 0;
1545 s->img_buffer = s->buffer_start;
1546 s->img_buffer_end = s->buffer_start+1;
1549 s->img_buffer = s->buffer_start;
1550 s->img_buffer_end = s->buffer_start + n;
1554 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1556 if (s->img_buffer < s->img_buffer_end)
1557 return *s->img_buffer++;
1558 if (s->read_from_callbacks) {
1559 stbi__refill_buffer(s);
1560 return *s->img_buffer++;
1565 #if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
1568 stbi_inline static int stbi__at_eof(stbi__context *s)
1571 if (!(s->io.eof)(s->io_user_data)) return 0;
1572 // if feof() is true, check if buffer = end
1573 // special case: we've only got the special 0 character at the end
1574 if (s->read_from_callbacks == 0) return 1;
1577 return s->img_buffer >= s->img_buffer_end;
1581 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC)
1584 static void stbi__skip(stbi__context *s, int n)
1586 if (n == 0) return; // already there!
1588 s->img_buffer = s->img_buffer_end;
1592 int blen = (int) (s->img_buffer_end - s->img_buffer);
1594 s->img_buffer = s->img_buffer_end;
1595 (s->io.skip)(s->io_user_data, n - blen);
1603 #if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM)
1606 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1609 int blen = (int) (s->img_buffer_end - s->img_buffer);
1613 memcpy(buffer, s->img_buffer, blen);
1615 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1616 res = (count == (n-blen));
1617 s->img_buffer = s->img_buffer_end;
1622 if (s->img_buffer+n <= s->img_buffer_end) {
1623 memcpy(buffer, s->img_buffer, n);
1631 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
1634 static int stbi__get16be(stbi__context *s)
1636 int z = stbi__get8(s);
1637 return (z << 8) + stbi__get8(s);
1641 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
1644 static stbi__uint32 stbi__get32be(stbi__context *s)
1646 stbi__uint32 z = stbi__get16be(s);
1647 return (z << 16) + stbi__get16be(s);
1651 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1654 static int stbi__get16le(stbi__context *s)
1656 int z = stbi__get8(s);
1657 return z + (stbi__get8(s) << 8);
1662 static stbi__uint32 stbi__get32le(stbi__context *s)
1664 stbi__uint32 z = stbi__get16le(s);
1665 return z + (stbi__get16le(s) << 16);
1669 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1671 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
1674 //////////////////////////////////////////////////////////////////////////////
1676 // generic converter from built-in img_n to req_comp
1677 // individual types do this automatically as much as possible (e.g. jpeg
1678 // does all cases internally since it needs to colorspace convert anyway,
1679 // and it never has alpha, so very few cases ). png can automatically
1680 // interleave an alpha=255 channel, but falls back to this for other cases
1682 // assume data buffer is malloced, so malloc a new one and free that one
1683 // only failure mode is malloc failing
1685 static stbi_uc stbi__compute_y(int r, int g, int b)
1687 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1691 #if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
1694 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1697 unsigned char *good;
1699 if (req_comp == img_n) return data;
1700 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1702 good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0);
1705 return stbi__errpuc("outofmem", "Out of memory");
1708 for (j=0; j < (int) y; ++j) {
1709 unsigned char *src = data + j * x * img_n ;
1710 unsigned char *dest = good + j * x * req_comp;
1712 #define STBI__COMBO(a,b) ((a)*8+(b))
1713 #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1714 // convert source image with img_n components to one with req_comp components;
1715 // avoid switch per pixel, so use switch per scanline and massive macros
1716 switch (STBI__COMBO(img_n, req_comp)) {
1717 STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255; } break;
1718 STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1719 STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255; } break;
1720 STBI__CASE(2,1) { dest[0]=src[0]; } break;
1721 STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1722 STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1]; } break;
1723 STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255; } break;
1724 STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); } break;
1725 STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255; } break;
1726 STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); } break;
1727 STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break;
1728 STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2]; } break;
1729 default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return stbi__errpuc("unsupported", "Unsupported format conversion");
1739 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
1742 static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
1744 return (stbi__uint16) (((r*77) + (g*150) + (29*b)) >> 8);
1748 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
1751 static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1756 if (req_comp == img_n) return data;
1757 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1759 good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2);
1762 return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
1765 for (j=0; j < (int) y; ++j) {
1766 stbi__uint16 *src = data + j * x * img_n ;
1767 stbi__uint16 *dest = good + j * x * req_comp;
1769 #define STBI__COMBO(a,b) ((a)*8+(b))
1770 #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1771 // convert source image with img_n components to one with req_comp components;
1772 // avoid switch per pixel, so use switch per scanline and massive macros
1773 switch (STBI__COMBO(img_n, req_comp)) {
1774 STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff; } break;
1775 STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1776 STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff; } break;
1777 STBI__CASE(2,1) { dest[0]=src[0]; } break;
1778 STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1779 STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1]; } break;
1780 STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff; } break;
1781 STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); } break;
1782 STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break;
1783 STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); } break;
1784 STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break;
1785 STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2]; } break;
1786 default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return (stbi__uint16*) stbi__errpuc("unsupported", "Unsupported format conversion");
1796 #ifndef STBI_NO_LINEAR
1797 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1801 if (!data) return NULL;
1802 output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
1803 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1804 // compute number of non-alpha components
1805 if (comp & 1) n = comp; else n = comp-1;
1806 for (i=0; i < x*y; ++i) {
1807 for (k=0; k < n; ++k) {
1808 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1812 for (i=0; i < x*y; ++i) {
1813 output[i*comp + n] = data[i*comp + n]/255.0f;
1822 #define stbi__float2int(x) ((int) (x))
1823 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1827 if (!data) return NULL;
1828 output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0);
1829 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1830 // compute number of non-alpha components
1831 if (comp & 1) n = comp; else n = comp-1;
1832 for (i=0; i < x*y; ++i) {
1833 for (k=0; k < n; ++k) {
1834 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1836 if (z > 255) z = 255;
1837 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1840 float z = data[i*comp+k] * 255 + 0.5f;
1842 if (z > 255) z = 255;
1843 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1851 //////////////////////////////////////////////////////////////////////////////
1853 // "baseline" JPEG/JFIF decoder
1855 // simple implementation
1856 // - doesn't support delayed output of y-dimension
1857 // - simple interface (only one output format: 8-bit interleaved RGB)
1858 // - doesn't try to recover corrupt jpegs
1859 // - doesn't allow partial loading, loading multiple at once
1860 // - still fast on x86 (copying globals into locals doesn't help x86)
1861 // - allocates lots of intermediate memory (full size of all components)
1862 // - non-interleaved case requires this anyway
1863 // - allows good upsampling (see next)
1865 // - upsampled channels are bilinearly interpolated, even across blocks
1866 // - quality integer IDCT derived from IJG's 'slow'
1868 // - fast huffman; reasonable integer IDCT
1869 // - some SIMD kernels for common paths on targets with SSE2/NEON
1870 // - uses a lot of intermediate memory, could cache poorly
1872 #ifndef STBI_NO_JPEG
1874 // huffman decoding acceleration
1875 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1879 stbi_uc fast[1 << FAST_BITS];
1880 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1881 stbi__uint16 code[256];
1882 stbi_uc values[256];
1884 unsigned int maxcode[18];
1885 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1891 stbi__huffman huff_dc[4];
1892 stbi__huffman huff_ac[4];
1893 stbi__uint16 dequant[4][64];
1894 stbi__int16 fast_ac[4][1 << FAST_BITS];
1896 // sizes for components, interleaved MCUs
1897 int img_h_max, img_v_max;
1898 int img_mcu_x, img_mcu_y;
1899 int img_mcu_w, img_mcu_h;
1901 // definition of jpeg image component
1912 void *raw_data, *raw_coeff;
1914 short *coeff; // progressive only
1915 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1918 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1919 int code_bits; // number of valid bits
1920 unsigned char marker; // marker seen while filling entropy buffer
1921 int nomore; // flag if we saw a marker so must stop
1930 int app14_color_transform; // Adobe APP14 tag
1933 int scan_n, order[4];
1934 int restart_interval, todo;
1937 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1938 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1939 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1942 static int stbi__build_huffman(stbi__huffman *h, int *count)
1946 // build size list for each symbol (from JPEG spec)
1947 for (i=0; i < 16; ++i)
1948 for (j=0; j < count[i]; ++j)
1949 h->size[k++] = (stbi_uc) (i+1);
1952 // compute actual symbols (from jpeg spec)
1955 for(j=1; j <= 16; ++j) {
1956 // compute delta to add to code to compute symbol id
1957 h->delta[j] = k - code;
1958 if (h->size[k] == j) {
1959 while (h->size[k] == j)
1960 h->code[k++] = (stbi__uint16) (code++);
1961 if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1963 // compute largest code + 1 for this size, preshifted as needed later
1964 h->maxcode[j] = code << (16-j);
1967 h->maxcode[j] = 0xffffffff;
1969 // build non-spec acceleration table; 255 is flag for not-accelerated
1970 memset(h->fast, 255, 1 << FAST_BITS);
1971 for (i=0; i < k; ++i) {
1973 if (s <= FAST_BITS) {
1974 int c = h->code[i] << (FAST_BITS-s);
1975 int m = 1 << (FAST_BITS-s);
1976 for (j=0; j < m; ++j) {
1977 h->fast[c+j] = (stbi_uc) i;
1984 // build a table that decodes both magnitude and value of small ACs in
1986 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1989 for (i=0; i < (1 << FAST_BITS); ++i) {
1990 stbi_uc fast = h->fast[i];
1993 int rs = h->values[fast];
1994 int run = (rs >> 4) & 15;
1995 int magbits = rs & 15;
1996 int len = h->size[fast];
1998 if (magbits && len + magbits <= FAST_BITS) {
1999 // magnitude code followed by receive_extend code
2000 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
2001 int m = 1 << (magbits - 1);
2002 if (k < m) k += (~0U << magbits) + 1;
2003 // if the result is small enough, we can fit it in fast_ac table
2004 if (k >= -128 && k <= 127)
2005 fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
2011 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
2014 unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
2016 int c = stbi__get8(j->s);
2017 while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
2019 j->marker = (unsigned char) c;
2024 j->code_buffer |= b << (24 - j->code_bits);
2026 } while (j->code_bits <= 24);
2030 static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
2032 // decode a jpeg huffman value from the bitstream
2033 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
2038 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2040 // look at the top FAST_BITS and determine what symbol ID it is,
2041 // if the code is <= FAST_BITS
2042 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
2046 if (s > j->code_bits)
2048 j->code_buffer <<= s;
2050 return h->values[k];
2053 // naive test is to shift the code_buffer down so k bits are
2054 // valid, then test against maxcode. To speed this up, we've
2055 // preshifted maxcode left so that it has (16-k) 0s at the
2056 // end; in other words, regardless of the number of bits, it
2057 // wants to be compared against something shifted to have 16;
2058 // that way we don't need to shift inside the loop.
2059 temp = j->code_buffer >> 16;
2060 for (k=FAST_BITS+1 ; ; ++k)
2061 if (temp < h->maxcode[k])
2064 // error! code not found
2069 if (k > j->code_bits)
2072 // convert the huffman code to the symbol id
2073 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
2074 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
2076 // convert the id to a symbol
2078 j->code_buffer <<= k;
2079 return h->values[c];
2082 // bias[n] = (-1<<n) + 1
2083 static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
2085 // combined JPEG 'receive' and JPEG 'extend', since baseline
2086 // always extends everything it receives.
2087 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
2091 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
2093 sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
2094 k = stbi_lrot(j->code_buffer, n);
2095 if (n < 0 || n >= (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask))) return 0;
2096 j->code_buffer = k & ~stbi__bmask[n];
2097 k &= stbi__bmask[n];
2099 return k + (stbi__jbias[n] & ~sgn);
2102 // get some unsigned bits
2103 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
2106 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
2107 k = stbi_lrot(j->code_buffer, n);
2108 j->code_buffer = k & ~stbi__bmask[n];
2109 k &= stbi__bmask[n];
2114 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
2117 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
2119 j->code_buffer <<= 1;
2121 return k & 0x80000000;
2124 // given a value that's at position X in the zigzag stream,
2125 // where does it appear in the 8x8 matrix coded as row-major?
2126 static const stbi_uc stbi__jpeg_dezigzag[64+15] =
2128 0, 1, 8, 16, 9, 2, 3, 10,
2129 17, 24, 32, 25, 18, 11, 4, 5,
2130 12, 19, 26, 33, 40, 48, 41, 34,
2131 27, 20, 13, 6, 7, 14, 21, 28,
2132 35, 42, 49, 56, 57, 50, 43, 36,
2133 29, 22, 15, 23, 30, 37, 44, 51,
2134 58, 59, 52, 45, 38, 31, 39, 46,
2135 53, 60, 61, 54, 47, 55, 62, 63,
2136 // let corrupt input sample past end
2137 63, 63, 63, 63, 63, 63, 63, 63,
2138 63, 63, 63, 63, 63, 63, 63
2141 // decode one 64-entry block--
2142 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant)
2147 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2148 t = stbi__jpeg_huff_decode(j, hdc);
2149 if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2151 // 0 all the ac values now so we can do it 32-bits at a time
2152 memset(data,0,64*sizeof(data[0]));
2154 diff = t ? stbi__extend_receive(j, t) : 0;
2155 dc = j->img_comp[b].dc_pred + diff;
2156 j->img_comp[b].dc_pred = dc;
2157 data[0] = (short) (dc * dequant[0]);
2159 // decode AC components, see JPEG spec
2164 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2165 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
2167 if (r) { // fast-AC path
2168 k += (r >> 4) & 15; // run
2169 s = r & 15; // combined length
2170 j->code_buffer <<= s;
2172 // decode into unzigzag'd location
2173 zig = stbi__jpeg_dezigzag[k++];
2174 data[zig] = (short) ((r >> 8) * dequant[zig]);
2176 int rs = stbi__jpeg_huff_decode(j, hac);
2177 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2181 if (rs != 0xf0) break; // end block
2185 // decode into unzigzag'd location
2186 zig = stbi__jpeg_dezigzag[k++];
2187 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
2194 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
2198 if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2200 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2202 if (j->succ_high == 0) {
2203 // first scan for DC coefficient, must be first
2204 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
2205 t = stbi__jpeg_huff_decode(j, hdc);
2206 if (t == -1) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2207 diff = t ? stbi__extend_receive(j, t) : 0;
2209 dc = j->img_comp[b].dc_pred + diff;
2210 j->img_comp[b].dc_pred = dc;
2211 data[0] = (short) (dc << j->succ_low);
2213 // refinement scan for DC coefficient
2214 if (stbi__jpeg_get_bit(j))
2215 data[0] += (short) (1 << j->succ_low);
2220 // @OPTIMIZE: store non-zigzagged during the decode passes,
2221 // and only de-zigzag when dequantizing
2222 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
2225 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2227 if (j->succ_high == 0) {
2228 int shift = j->succ_low;
2239 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2240 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
2242 if (r) { // fast-AC path
2243 k += (r >> 4) & 15; // run
2244 s = r & 15; // combined length
2245 j->code_buffer <<= s;
2247 zig = stbi__jpeg_dezigzag[k++];
2248 data[zig] = (short) ((r >> 8) << shift);
2250 int rs = stbi__jpeg_huff_decode(j, hac);
2251 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2256 j->eob_run = (1 << r);
2258 j->eob_run += stbi__jpeg_get_bits(j, r);
2265 zig = stbi__jpeg_dezigzag[k++];
2266 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
2269 } while (k <= j->spec_end);
2271 // refinement scan for these AC coefficients
2273 short bit = (short) (1 << j->succ_low);
2277 for (k = j->spec_start; k <= j->spec_end; ++k) {
2278 short *p = &data[stbi__jpeg_dezigzag[k]];
2280 if (stbi__jpeg_get_bit(j))
2281 if ((*p & bit)==0) {
2292 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
2293 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2298 j->eob_run = (1 << r) - 1;
2300 j->eob_run += stbi__jpeg_get_bits(j, r);
2301 r = 64; // force end of block
2303 // r=15 s=0 should write 16 0s, so we just do
2304 // a run of 15 0s and then write s (which is 0),
2305 // so we don't have to do anything special here
2308 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
2310 if (stbi__jpeg_get_bit(j))
2317 while (k <= j->spec_end) {
2318 short *p = &data[stbi__jpeg_dezigzag[k++]];
2320 if (stbi__jpeg_get_bit(j))
2321 if ((*p & bit)==0) {
2335 } while (k <= j->spec_end);
2341 // take a -128..127 value and stbi__clamp it and convert to 0..255
2342 stbi_inline static stbi_uc stbi__clamp(int x)
2344 // trick to use a single test to catch both cases
2345 if ((unsigned int) x > 255) {
2346 if (x < 0) return 0;
2347 if (x > 255) return 255;
2352 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
2353 #define stbi__fsh(x) ((x) * 4096)
2355 // derived from jidctint -- DCT_ISLOW
2356 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
2357 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
2360 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
2361 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
2362 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
2365 t0 = stbi__fsh(p2+p3); \
2366 t1 = stbi__fsh(p2-p3); \
2379 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
2380 t0 = t0*stbi__f2f( 0.298631336f); \
2381 t1 = t1*stbi__f2f( 2.053119869f); \
2382 t2 = t2*stbi__f2f( 3.072711026f); \
2383 t3 = t3*stbi__f2f( 1.501321110f); \
2384 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
2385 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
2386 p3 = p3*stbi__f2f(-1.961570560f); \
2387 p4 = p4*stbi__f2f(-0.390180644f); \
2393 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
2395 int i,val[64],*v=val;
2400 for (i=0; i < 8; ++i,++d, ++v) {
2401 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
2402 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
2403 && d[40]==0 && d[48]==0 && d[56]==0) {
2404 // no shortcut 0 seconds
2405 // (1|2|3|4|5|6|7)==0 0 seconds
2406 // all separate -0.047 seconds
2407 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
2408 int dcterm = d[0]*4;
2409 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
2411 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
2412 // constants scaled things up by 1<<12; let's bring them back
2413 // down, but keep 2 extra bits of precision
2414 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
2415 v[ 0] = (x0+t3) >> 10;
2416 v[56] = (x0-t3) >> 10;
2417 v[ 8] = (x1+t2) >> 10;
2418 v[48] = (x1-t2) >> 10;
2419 v[16] = (x2+t1) >> 10;
2420 v[40] = (x2-t1) >> 10;
2421 v[24] = (x3+t0) >> 10;
2422 v[32] = (x3-t0) >> 10;
2426 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2427 // no fast case since the first 1D IDCT spread components out
2428 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2429 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2430 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2431 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2432 // so we want to round that, which means adding 0.5 * 1<<17,
2433 // aka 65536. Also, we'll end up with -128 to 127 that we want
2434 // to encode as 0..255 by adding 128, so we'll add that before the shift
2435 x0 += 65536 + (128<<17);
2436 x1 += 65536 + (128<<17);
2437 x2 += 65536 + (128<<17);
2438 x3 += 65536 + (128<<17);
2439 // tried computing the shifts into temps, or'ing the temps to see
2440 // if any were out of range, but that was slower
2441 o[0] = stbi__clamp((x0+t3) >> 17);
2442 o[7] = stbi__clamp((x0-t3) >> 17);
2443 o[1] = stbi__clamp((x1+t2) >> 17);
2444 o[6] = stbi__clamp((x1-t2) >> 17);
2445 o[2] = stbi__clamp((x2+t1) >> 17);
2446 o[5] = stbi__clamp((x2-t1) >> 17);
2447 o[3] = stbi__clamp((x3+t0) >> 17);
2448 o[4] = stbi__clamp((x3-t0) >> 17);
2453 // sse2 integer IDCT. not the fastest possible implementation but it
2454 // produces bit-identical results to the generic C version so it's
2455 // fully "transparent".
2456 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2458 // This is constructed to match our regular (generic) integer IDCT exactly.
2459 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2462 // dot product constant: even elems=x, odd elems=y
2463 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2465 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2466 // out(1) = c1[even]*x + c1[odd]*y
2467 #define dct_rot(out0,out1, x,y,c0,c1) \
2468 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2469 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2470 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2471 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2472 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2473 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2475 // out = in << 12 (in 16-bit, out 32-bit)
2476 #define dct_widen(out, in) \
2477 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2478 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2481 #define dct_wadd(out, a, b) \
2482 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2483 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2486 #define dct_wsub(out, a, b) \
2487 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2488 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2490 // butterfly a/b, add bias, then shift by "s" and pack
2491 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2493 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2494 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2495 dct_wadd(sum, abiased, b); \
2496 dct_wsub(dif, abiased, b); \
2497 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2498 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2501 // 8-bit interleave step (for transposes)
2502 #define dct_interleave8(a, b) \
2504 a = _mm_unpacklo_epi8(a, b); \
2505 b = _mm_unpackhi_epi8(tmp, b)
2507 // 16-bit interleave step (for transposes)
2508 #define dct_interleave16(a, b) \
2510 a = _mm_unpacklo_epi16(a, b); \
2511 b = _mm_unpackhi_epi16(tmp, b)
2513 #define dct_pass(bias,shift) \
2516 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2517 __m128i sum04 = _mm_add_epi16(row0, row4); \
2518 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2519 dct_widen(t0e, sum04); \
2520 dct_widen(t1e, dif04); \
2521 dct_wadd(x0, t0e, t3e); \
2522 dct_wsub(x3, t0e, t3e); \
2523 dct_wadd(x1, t1e, t2e); \
2524 dct_wsub(x2, t1e, t2e); \
2526 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2527 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2528 __m128i sum17 = _mm_add_epi16(row1, row7); \
2529 __m128i sum35 = _mm_add_epi16(row3, row5); \
2530 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2531 dct_wadd(x4, y0o, y4o); \
2532 dct_wadd(x5, y1o, y5o); \
2533 dct_wadd(x6, y2o, y5o); \
2534 dct_wadd(x7, y3o, y4o); \
2535 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2536 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2537 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2538 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2541 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2542 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2543 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2544 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2545 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2546 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2547 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2548 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2550 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2551 __m128i bias_0 = _mm_set1_epi32(512);
2552 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2555 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2556 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2557 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2558 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2559 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2560 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2561 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2562 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2565 dct_pass(bias_0, 10);
2568 // 16bit 8x8 transpose pass 1
2569 dct_interleave16(row0, row4);
2570 dct_interleave16(row1, row5);
2571 dct_interleave16(row2, row6);
2572 dct_interleave16(row3, row7);
2575 dct_interleave16(row0, row2);
2576 dct_interleave16(row1, row3);
2577 dct_interleave16(row4, row6);
2578 dct_interleave16(row5, row7);
2581 dct_interleave16(row0, row1);
2582 dct_interleave16(row2, row3);
2583 dct_interleave16(row4, row5);
2584 dct_interleave16(row6, row7);
2588 dct_pass(bias_1, 17);
2592 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2593 __m128i p1 = _mm_packus_epi16(row2, row3);
2594 __m128i p2 = _mm_packus_epi16(row4, row5);
2595 __m128i p3 = _mm_packus_epi16(row6, row7);
2597 // 8bit 8x8 transpose pass 1
2598 dct_interleave8(p0, p2); // a0e0a1e1...
2599 dct_interleave8(p1, p3); // c0g0c1g1...
2602 dct_interleave8(p0, p1); // a0c0e0g0...
2603 dct_interleave8(p2, p3); // b0d0f0h0...
2606 dct_interleave8(p0, p2); // a0b0c0d0...
2607 dct_interleave8(p1, p3); // a4b4c4d4...
2610 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2611 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2612 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2613 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2614 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2615 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2616 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2617 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2626 #undef dct_interleave8
2627 #undef dct_interleave16
2635 // NEON integer IDCT. should produce bit-identical
2636 // results to the generic C version.
2637 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2639 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2641 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2642 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2643 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2644 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2645 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2646 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2647 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2648 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2649 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2650 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2651 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2652 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2654 #define dct_long_mul(out, inq, coeff) \
2655 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2656 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2658 #define dct_long_mac(out, acc, inq, coeff) \
2659 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2660 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2662 #define dct_widen(out, inq) \
2663 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2664 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2667 #define dct_wadd(out, a, b) \
2668 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2669 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2672 #define dct_wsub(out, a, b) \
2673 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2674 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2676 // butterfly a/b, then shift using "shiftop" by "s" and pack
2677 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2679 dct_wadd(sum, a, b); \
2680 dct_wsub(dif, a, b); \
2681 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2682 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2685 #define dct_pass(shiftop, shift) \
2688 int16x8_t sum26 = vaddq_s16(row2, row6); \
2689 dct_long_mul(p1e, sum26, rot0_0); \
2690 dct_long_mac(t2e, p1e, row6, rot0_1); \
2691 dct_long_mac(t3e, p1e, row2, rot0_2); \
2692 int16x8_t sum04 = vaddq_s16(row0, row4); \
2693 int16x8_t dif04 = vsubq_s16(row0, row4); \
2694 dct_widen(t0e, sum04); \
2695 dct_widen(t1e, dif04); \
2696 dct_wadd(x0, t0e, t3e); \
2697 dct_wsub(x3, t0e, t3e); \
2698 dct_wadd(x1, t1e, t2e); \
2699 dct_wsub(x2, t1e, t2e); \
2701 int16x8_t sum15 = vaddq_s16(row1, row5); \
2702 int16x8_t sum17 = vaddq_s16(row1, row7); \
2703 int16x8_t sum35 = vaddq_s16(row3, row5); \
2704 int16x8_t sum37 = vaddq_s16(row3, row7); \
2705 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2706 dct_long_mul(p5o, sumodd, rot1_0); \
2707 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2708 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2709 dct_long_mul(p3o, sum37, rot2_0); \
2710 dct_long_mul(p4o, sum15, rot2_1); \
2711 dct_wadd(sump13o, p1o, p3o); \
2712 dct_wadd(sump24o, p2o, p4o); \
2713 dct_wadd(sump23o, p2o, p3o); \
2714 dct_wadd(sump14o, p1o, p4o); \
2715 dct_long_mac(x4, sump13o, row7, rot3_0); \
2716 dct_long_mac(x5, sump24o, row5, rot3_1); \
2717 dct_long_mac(x6, sump23o, row3, rot3_2); \
2718 dct_long_mac(x7, sump14o, row1, rot3_3); \
2719 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2720 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2721 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2722 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2726 row0 = vld1q_s16(data + 0*8);
2727 row1 = vld1q_s16(data + 1*8);
2728 row2 = vld1q_s16(data + 2*8);
2729 row3 = vld1q_s16(data + 3*8);
2730 row4 = vld1q_s16(data + 4*8);
2731 row5 = vld1q_s16(data + 5*8);
2732 row6 = vld1q_s16(data + 6*8);
2733 row7 = vld1q_s16(data + 7*8);
2736 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2739 dct_pass(vrshrn_n_s32, 10);
2741 // 16bit 8x8 transpose
2743 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2744 // whether compilers actually get this is another story, sadly.
2745 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2746 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2747 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2750 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2751 dct_trn16(row2, row3);
2752 dct_trn16(row4, row5);
2753 dct_trn16(row6, row7);
2756 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2757 dct_trn32(row1, row3);
2758 dct_trn32(row4, row6);
2759 dct_trn32(row5, row7);
2762 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2763 dct_trn64(row1, row5);
2764 dct_trn64(row2, row6);
2765 dct_trn64(row3, row7);
2773 // vrshrn_n_s32 only supports shifts up to 16, we need
2774 // 17. so do a non-rounding shift of 16 first then follow
2775 // up with a rounding shift by 1.
2776 dct_pass(vshrn_n_s32, 16);
2780 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2781 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2782 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2783 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2784 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2785 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2786 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2787 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2789 // again, these can translate into one instruction, but often don't.
2790 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2791 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2792 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2794 // sadly can't use interleaved stores here since we only write
2795 // 8 bytes to each scan line!
2797 // 8x8 8-bit transpose pass 1
2804 dct_trn8_16(p0, p2);
2805 dct_trn8_16(p1, p3);
2806 dct_trn8_16(p4, p6);
2807 dct_trn8_16(p5, p7);
2810 dct_trn8_32(p0, p4);
2811 dct_trn8_32(p1, p5);
2812 dct_trn8_32(p2, p6);
2813 dct_trn8_32(p3, p7);
2816 vst1_u8(out, p0); out += out_stride;
2817 vst1_u8(out, p1); out += out_stride;
2818 vst1_u8(out, p2); out += out_stride;
2819 vst1_u8(out, p3); out += out_stride;
2820 vst1_u8(out, p4); out += out_stride;
2821 vst1_u8(out, p5); out += out_stride;
2822 vst1_u8(out, p6); out += out_stride;
2841 #define STBI__MARKER_none 0xff
2842 // if there's a pending marker from the entropy stream, return that
2843 // otherwise, fetch from the stream and get a marker. if there's no
2844 // marker, return 0xff, which is never a valid marker value
2845 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2848 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2849 x = stbi__get8(j->s);
2850 if (x != 0xff) return STBI__MARKER_none;
2852 x = stbi__get8(j->s); // consume repeated 0xff fill bytes
2856 // in each scan, we'll have scan_n components, and the order
2857 // of the components is specified by order[]
2858 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2860 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2861 // the dc prediction
2862 static void stbi__jpeg_reset(stbi__jpeg *j)
2867 j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
2868 j->marker = STBI__MARKER_none;
2869 j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2871 // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2872 // since we don't even allow 1<<30 pixels
2875 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2877 stbi__jpeg_reset(z);
2878 if (!z->progressive) {
2879 if (z->scan_n == 1) {
2881 STBI_SIMD_ALIGN(short, data[64]);
2882 int n = z->order[0];
2883 // non-interleaved data, we just need to process one block at a time,
2884 // in trivial scanline order
2885 // number of blocks to do just depends on how many actual "pixels" this
2886 // component has, independent of interleaved MCU blocking and such
2887 int w = (z->img_comp[n].x+7) >> 3;
2888 int h = (z->img_comp[n].y+7) >> 3;
2889 for (j=0; j < h; ++j) {
2890 for (i=0; i < w; ++i) {
2891 int ha = z->img_comp[n].ha;
2892 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2893 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2894 // every data block is an MCU, so countdown the restart interval
2895 if (--z->todo <= 0) {
2896 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2897 // if it's NOT a restart, then just bail, so we get corrupt data
2898 // rather than no data
2899 if (!STBI__RESTART(z->marker)) return 1;
2900 stbi__jpeg_reset(z);
2905 } else { // interleaved
2907 STBI_SIMD_ALIGN(short, data[64]);
2908 for (j=0; j < z->img_mcu_y; ++j) {
2909 for (i=0; i < z->img_mcu_x; ++i) {
2910 // scan an interleaved mcu... process scan_n components in order
2911 for (k=0; k < z->scan_n; ++k) {
2912 int n = z->order[k];
2913 // scan out an mcu's worth of this component; that's just determined
2914 // by the basic H and V specified for the component
2915 for (y=0; y < z->img_comp[n].v; ++y) {
2916 for (x=0; x < z->img_comp[n].h; ++x) {
2917 int x2 = (i*z->img_comp[n].h + x)*8;
2918 int y2 = (j*z->img_comp[n].v + y)*8;
2919 int ha = z->img_comp[n].ha;
2920 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2921 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2925 // after all interleaved components, that's an interleaved MCU,
2926 // so now count down the restart interval
2927 if (--z->todo <= 0) {
2928 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2929 if (!STBI__RESTART(z->marker)) return 1;
2930 stbi__jpeg_reset(z);
2937 if (z->scan_n == 1) {
2939 int n = z->order[0];
2940 // non-interleaved data, we just need to process one block at a time,
2941 // in trivial scanline order
2942 // number of blocks to do just depends on how many actual "pixels" this
2943 // component has, independent of interleaved MCU blocking and such
2944 int w = (z->img_comp[n].x+7) >> 3;
2945 int h = (z->img_comp[n].y+7) >> 3;
2946 for (j=0; j < h; ++j) {
2947 for (i=0; i < w; ++i) {
2948 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2949 if (z->spec_start == 0) {
2950 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2953 int ha = z->img_comp[n].ha;
2954 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2957 // every data block is an MCU, so countdown the restart interval
2958 if (--z->todo <= 0) {
2959 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2960 if (!STBI__RESTART(z->marker)) return 1;
2961 stbi__jpeg_reset(z);
2966 } else { // interleaved
2968 for (j=0; j < z->img_mcu_y; ++j) {
2969 for (i=0; i < z->img_mcu_x; ++i) {
2970 // scan an interleaved mcu... process scan_n components in order
2971 for (k=0; k < z->scan_n; ++k) {
2972 int n = z->order[k];
2973 // scan out an mcu's worth of this component; that's just determined
2974 // by the basic H and V specified for the component
2975 for (y=0; y < z->img_comp[n].v; ++y) {
2976 for (x=0; x < z->img_comp[n].h; ++x) {
2977 int x2 = (i*z->img_comp[n].h + x);
2978 int y2 = (j*z->img_comp[n].v + y);
2979 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2980 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2985 // after all interleaved components, that's an interleaved MCU,
2986 // so now count down the restart interval
2987 if (--z->todo <= 0) {
2988 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2989 if (!STBI__RESTART(z->marker)) return 1;
2990 stbi__jpeg_reset(z);
2999 static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant)
3002 for (i=0; i < 64; ++i)
3003 data[i] *= dequant[i];
3006 static void stbi__jpeg_finish(stbi__jpeg *z)
3008 if (z->progressive) {
3009 // dequantize and idct the data
3011 for (n=0; n < z->s->img_n; ++n) {
3012 int w = (z->img_comp[n].x+7) >> 3;
3013 int h = (z->img_comp[n].y+7) >> 3;
3014 for (j=0; j < h; ++j) {
3015 for (i=0; i < w; ++i) {
3016 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
3017 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
3018 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
3025 static int stbi__process_marker(stbi__jpeg *z, int m)
3029 case STBI__MARKER_none: // no marker found
3030 return stbi__err("expected marker","Corrupt JPEG");
3032 case 0xDD: // DRI - specify restart interval
3033 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
3034 z->restart_interval = stbi__get16be(z->s);
3037 case 0xDB: // DQT - define quantization table
3038 L = stbi__get16be(z->s)-2;
3040 int q = stbi__get8(z->s);
3041 int p = q >> 4, sixteen = (p != 0);
3043 if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG");
3044 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
3046 for (i=0; i < 64; ++i)
3047 z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
3048 L -= (sixteen ? 129 : 65);
3052 case 0xC4: // DHT - define huffman table
3053 L = stbi__get16be(z->s)-2;
3056 int sizes[16],i,n=0;
3057 int q = stbi__get8(z->s);
3060 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
3061 for (i=0; i < 16; ++i) {
3062 sizes[i] = stbi__get8(z->s);
3067 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
3068 v = z->huff_dc[th].values;
3070 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
3071 v = z->huff_ac[th].values;
3073 for (i=0; i < n; ++i)
3074 v[i] = stbi__get8(z->s);
3076 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
3082 // check for comment block or APP blocks
3083 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
3084 L = stbi__get16be(z->s);
3087 return stbi__err("bad COM len","Corrupt JPEG");
3089 return stbi__err("bad APP len","Corrupt JPEG");
3093 if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
3094 static const unsigned char tag[5] = {'J','F','I','F','\0'};
3097 for (i=0; i < 5; ++i)
3098 if (stbi__get8(z->s) != tag[i])
3103 } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
3104 static const unsigned char tag[6] = {'A','d','o','b','e','\0'};
3107 for (i=0; i < 6; ++i)
3108 if (stbi__get8(z->s) != tag[i])
3112 stbi__get8(z->s); // version
3113 stbi__get16be(z->s); // flags0
3114 stbi__get16be(z->s); // flags1
3115 z->app14_color_transform = stbi__get8(z->s); // color transform
3120 stbi__skip(z->s, L);
3124 return stbi__err("unknown marker","Corrupt JPEG");
3128 static int stbi__process_scan_header(stbi__jpeg *z)
3131 int Ls = stbi__get16be(z->s);
3132 z->scan_n = stbi__get8(z->s);
3133 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
3134 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
3135 for (i=0; i < z->scan_n; ++i) {
3136 int id = stbi__get8(z->s), which;
3137 int q = stbi__get8(z->s);
3138 for (which = 0; which < z->s->img_n; ++which)
3139 if (z->img_comp[which].id == id)
3141 if (which == z->s->img_n) return 0; // no match
3142 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
3143 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
3144 z->order[i] = which;
3149 z->spec_start = stbi__get8(z->s);
3150 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
3151 aa = stbi__get8(z->s);
3152 z->succ_high = (aa >> 4);
3153 z->succ_low = (aa & 15);
3154 if (z->progressive) {
3155 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
3156 return stbi__err("bad SOS", "Corrupt JPEG");
3158 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
3159 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
3167 static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why)
3170 for (i=0; i < ncomp; ++i) {
3171 if (z->img_comp[i].raw_data) {
3172 STBI_FREE(z->img_comp[i].raw_data);
3173 z->img_comp[i].raw_data = NULL;
3174 z->img_comp[i].data = NULL;
3176 if (z->img_comp[i].raw_coeff) {
3177 STBI_FREE(z->img_comp[i].raw_coeff);
3178 z->img_comp[i].raw_coeff = 0;
3179 z->img_comp[i].coeff = 0;
3181 if (z->img_comp[i].linebuf) {
3182 STBI_FREE(z->img_comp[i].linebuf);
3183 z->img_comp[i].linebuf = NULL;
3189 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
3191 stbi__context *s = z->s;
3192 int Lf,p,i,q, h_max=1,v_max=1,c;
3193 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
3194 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
3195 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
3196 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
3197 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
3198 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
3200 if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG");
3202 for (i=0; i < c; ++i) {
3203 z->img_comp[i].data = NULL;
3204 z->img_comp[i].linebuf = NULL;
3207 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
3210 for (i=0; i < s->img_n; ++i) {
3211 static const unsigned char rgb[3] = { 'R', 'G', 'B' };
3212 z->img_comp[i].id = stbi__get8(s);
3213 if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
3216 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
3217 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
3218 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
3221 if (scan != STBI__SCAN_load) return 1;
3223 if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
3225 for (i=0; i < s->img_n; ++i) {
3226 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
3227 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
3230 // compute interleaved mcu info
3231 z->img_h_max = h_max;
3232 z->img_v_max = v_max;
3233 z->img_mcu_w = h_max * 8;
3234 z->img_mcu_h = v_max * 8;
3235 // these sizes can't be more than 17 bits
3236 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
3237 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
3239 for (i=0; i < s->img_n; ++i) {
3240 // number of effective pixels (e.g. for non-interleaved MCU)
3241 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
3242 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
3243 // to simplify generation, we'll allocate enough memory to decode
3244 // the bogus oversized data from using interleaved MCUs and their
3245 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
3246 // discard the extra data until colorspace conversion
3248 // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
3249 // so these muls can't overflow with 32-bit ints (which we require)
3250 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
3251 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
3252 z->img_comp[i].coeff = 0;
3253 z->img_comp[i].raw_coeff = 0;
3254 z->img_comp[i].linebuf = NULL;
3255 z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
3256 if (z->img_comp[i].raw_data == NULL)
3257 return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
3258 // align blocks for idct using mmx/sse
3259 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
3260 if (z->progressive) {
3261 // w2, h2 are multiples of 8 (see above)
3262 z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
3263 z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
3264 z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
3265 if (z->img_comp[i].raw_coeff == NULL)
3266 return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
3267 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
3274 // use comparisons since in some cases we handle more than one case (e.g. SOF)
3275 #define stbi__DNL(x) ((x) == 0xdc)
3276 #define stbi__SOI(x) ((x) == 0xd8)
3277 #define stbi__EOI(x) ((x) == 0xd9)
3278 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
3279 #define stbi__SOS(x) ((x) == 0xda)
3281 #define stbi__SOF_progressive(x) ((x) == 0xc2)
3283 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
3287 z->app14_color_transform = -1; // valid values are 0,1,2
3288 z->marker = STBI__MARKER_none; // initialize cached marker to empty
3289 m = stbi__get_marker(z);
3290 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
3291 if (scan == STBI__SCAN_type) return 1;
3292 m = stbi__get_marker(z);
3293 while (!stbi__SOF(m)) {
3294 if (!stbi__process_marker(z,m)) return 0;
3295 m = stbi__get_marker(z);
3296 while (m == STBI__MARKER_none) {
3297 // some files have extra padding after their blocks, so ok, we'll scan
3298 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
3299 m = stbi__get_marker(z);
3302 z->progressive = stbi__SOF_progressive(m);
3303 if (!stbi__process_frame_header(z, scan)) return 0;
3307 // decode image to YCbCr format
3308 static int stbi__decode_jpeg_image(stbi__jpeg *j)
3311 for (m = 0; m < 4; m++) {
3312 j->img_comp[m].raw_data = NULL;
3313 j->img_comp[m].raw_coeff = NULL;
3315 j->restart_interval = 0;
3316 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
3317 m = stbi__get_marker(j);
3318 while (!stbi__EOI(m)) {
3320 if (!stbi__process_scan_header(j)) return 0;
3321 if (!stbi__parse_entropy_coded_data(j)) return 0;
3322 if (j->marker == STBI__MARKER_none ) {
3323 // handle 0s at the end of image data from IP Kamera 9060
3324 while (!stbi__at_eof(j->s)) {
3325 int x = stbi__get8(j->s);
3327 j->marker = stbi__get8(j->s);
3331 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
3333 } else if (stbi__DNL(m)) {
3334 int Ld = stbi__get16be(j->s);
3335 stbi__uint32 NL = stbi__get16be(j->s);
3336 if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
3337 if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
3339 if (!stbi__process_marker(j, m)) return 0;
3341 m = stbi__get_marker(j);
3344 stbi__jpeg_finish(j);
3348 // static jfif-centered resampling (across block boundaries)
3350 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
3353 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
3355 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3358 STBI_NOTUSED(in_far);
3364 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3366 // need to generate two samples vertically for every one in input
3369 for (i=0; i < w; ++i)
3370 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
3374 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3376 // need to generate two samples horizontally for every one in input
3378 stbi_uc *input = in_near;
3381 // if only one sample, can't do any interpolation
3382 out[0] = out[1] = input[0];
3387 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
3388 for (i=1; i < w-1; ++i) {
3389 int n = 3*input[i]+2;
3390 out[i*2+0] = stbi__div4(n+input[i-1]);
3391 out[i*2+1] = stbi__div4(n+input[i+1]);
3393 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
3394 out[i*2+1] = input[w-1];
3396 STBI_NOTUSED(in_far);
3402 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
3404 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3406 // need to generate 2x2 samples for every one in input
3409 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
3413 t1 = 3*in_near[0] + in_far[0];
3414 out[0] = stbi__div4(t1+2);
3415 for (i=1; i < w; ++i) {
3417 t1 = 3*in_near[i]+in_far[i];
3418 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3419 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3421 out[w*2-1] = stbi__div4(t1+2);
3428 #if defined(STBI_SSE2) || defined(STBI_NEON)
3429 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3431 // need to generate 2x2 samples for every one in input
3435 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
3439 t1 = 3*in_near[0] + in_far[0];
3440 // process groups of 8 pixels for as long as we can.
3441 // note we can't handle the last pixel in a row in this loop
3442 // because we need to handle the filter boundary conditions.
3443 for (; i < ((w-1) & ~7); i += 8) {
3444 #if defined(STBI_SSE2)
3445 // load and perform the vertical filtering pass
3446 // this uses 3*x + y = 4*x + (y - x)
3447 __m128i zero = _mm_setzero_si128();
3448 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
3449 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
3450 __m128i farw = _mm_unpacklo_epi8(farb, zero);
3451 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
3452 __m128i diff = _mm_sub_epi16(farw, nearw);
3453 __m128i nears = _mm_slli_epi16(nearw, 2);
3454 __m128i curr = _mm_add_epi16(nears, diff); // current row
3456 // horizontal filter works the same based on shifted vers of current
3457 // row. "prev" is current row shifted right by 1 pixel; we need to
3458 // insert the previous pixel value (from t1).
3459 // "next" is current row shifted left by 1 pixel, with first pixel
3460 // of next block of 8 pixels added in.
3461 __m128i prv0 = _mm_slli_si128(curr, 2);
3462 __m128i nxt0 = _mm_srli_si128(curr, 2);
3463 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
3464 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
3466 // horizontal filter, polyphase implementation since it's convenient:
3467 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3468 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3469 // note the shared term.
3470 __m128i bias = _mm_set1_epi16(8);
3471 __m128i curs = _mm_slli_epi16(curr, 2);
3472 __m128i prvd = _mm_sub_epi16(prev, curr);
3473 __m128i nxtd = _mm_sub_epi16(next, curr);
3474 __m128i curb = _mm_add_epi16(curs, bias);
3475 __m128i even = _mm_add_epi16(prvd, curb);
3476 __m128i odd = _mm_add_epi16(nxtd, curb);
3478 // interleave even and odd pixels, then undo scaling.
3479 __m128i int0 = _mm_unpacklo_epi16(even, odd);
3480 __m128i int1 = _mm_unpackhi_epi16(even, odd);
3481 __m128i de0 = _mm_srli_epi16(int0, 4);
3482 __m128i de1 = _mm_srli_epi16(int1, 4);
3484 // pack and write output
3485 __m128i outv = _mm_packus_epi16(de0, de1);
3486 _mm_storeu_si128((__m128i *) (out + i*2), outv);
3487 #elif defined(STBI_NEON)
3488 // load and perform the vertical filtering pass
3489 // this uses 3*x + y = 4*x + (y - x)
3490 uint8x8_t farb = vld1_u8(in_far + i);
3491 uint8x8_t nearb = vld1_u8(in_near + i);
3492 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3493 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3494 int16x8_t curr = vaddq_s16(nears, diff); // current row
3496 // horizontal filter works the same based on shifted vers of current
3497 // row. "prev" is current row shifted right by 1 pixel; we need to
3498 // insert the previous pixel value (from t1).
3499 // "next" is current row shifted left by 1 pixel, with first pixel
3500 // of next block of 8 pixels added in.
3501 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3502 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3503 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3504 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3506 // horizontal filter, polyphase implementation since it's convenient:
3507 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3508 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3509 // note the shared term.
3510 int16x8_t curs = vshlq_n_s16(curr, 2);
3511 int16x8_t prvd = vsubq_s16(prev, curr);
3512 int16x8_t nxtd = vsubq_s16(next, curr);
3513 int16x8_t even = vaddq_s16(curs, prvd);
3514 int16x8_t odd = vaddq_s16(curs, nxtd);
3516 // undo scaling and round, then store with even/odd phases interleaved
3518 o.val[0] = vqrshrun_n_s16(even, 4);
3519 o.val[1] = vqrshrun_n_s16(odd, 4);
3520 vst2_u8(out + i*2, o);
3523 // "previous" value for next iter
3524 t1 = 3*in_near[i+7] + in_far[i+7];
3528 t1 = 3*in_near[i] + in_far[i];
3529 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3531 for (++i; i < w; ++i) {
3533 t1 = 3*in_near[i]+in_far[i];
3534 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3535 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3537 out[w*2-1] = stbi__div4(t1+2);
3545 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3547 // resample with nearest-neighbor
3549 STBI_NOTUSED(in_far);
3550 for (i=0; i < w; ++i)
3551 for (j=0; j < hs; ++j)
3552 out[i*hs+j] = in_near[i];
3556 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3557 // to make sure the code produces the same results in both SIMD and scalar
3558 #define stbi__float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3559 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3562 for (i=0; i < count; ++i) {
3563 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3565 int cr = pcr[i] - 128;
3566 int cb = pcb[i] - 128;
3567 r = y_fixed + cr* stbi__float2fixed(1.40200f);
3568 g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
3569 b = y_fixed + cb* stbi__float2fixed(1.77200f);
3573 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3574 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3575 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3576 out[0] = (stbi_uc)r;
3577 out[1] = (stbi_uc)g;
3578 out[2] = (stbi_uc)b;
3584 #if defined(STBI_SSE2) || defined(STBI_NEON)
3585 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3590 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3591 // it's useful in practice (you wouldn't use it for textures, for example).
3592 // so just accelerate step == 4 case.
3594 // this is a fairly straightforward implementation and not super-optimized.
3595 __m128i signflip = _mm_set1_epi8(-0x80);
3596 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3597 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3598 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3599 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3600 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3601 __m128i xw = _mm_set1_epi16(255); // alpha channel
3603 for (; i+7 < count; i += 8) {
3605 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3606 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3607 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3608 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3609 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3611 // unpack to short (and left-shift cr, cb by 8)
3612 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3613 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3614 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3617 __m128i yws = _mm_srli_epi16(yw, 4);
3618 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3619 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3620 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3621 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3622 __m128i rws = _mm_add_epi16(cr0, yws);
3623 __m128i gwt = _mm_add_epi16(cb0, yws);
3624 __m128i bws = _mm_add_epi16(yws, cb1);
3625 __m128i gws = _mm_add_epi16(gwt, cr1);
3628 __m128i rw = _mm_srai_epi16(rws, 4);
3629 __m128i bw = _mm_srai_epi16(bws, 4);
3630 __m128i gw = _mm_srai_epi16(gws, 4);
3632 // back to byte, set up for transpose
3633 __m128i brb = _mm_packus_epi16(rw, bw);
3634 __m128i gxb = _mm_packus_epi16(gw, xw);
3636 // transpose to interleave channels
3637 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3638 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3639 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3640 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3643 _mm_storeu_si128((__m128i *) (out + 0), o0);
3644 _mm_storeu_si128((__m128i *) (out + 16), o1);
3651 // in this version, step=3 support would be easy to add. but is there demand?
3653 // this is a fairly straightforward implementation and not super-optimized.
3654 uint8x8_t signflip = vdup_n_u8(0x80);
3655 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3656 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3657 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3658 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3660 for (; i+7 < count; i += 8) {
3662 uint8x8_t y_bytes = vld1_u8(y + i);
3663 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3664 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3665 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3666 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3669 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3670 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3671 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3674 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3675 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3676 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3677 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3678 int16x8_t rws = vaddq_s16(yws, cr0);
3679 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3680 int16x8_t bws = vaddq_s16(yws, cb1);
3682 // undo scaling, round, convert to byte
3684 o.val[0] = vqrshrun_n_s16(rws, 4);
3685 o.val[1] = vqrshrun_n_s16(gws, 4);
3686 o.val[2] = vqrshrun_n_s16(bws, 4);
3687 o.val[3] = vdup_n_u8(255);
3689 // store, interleaving r/g/b/a
3696 for (; i < count; ++i) {
3697 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3699 int cr = pcr[i] - 128;
3700 int cb = pcb[i] - 128;
3701 r = y_fixed + cr* stbi__float2fixed(1.40200f);
3702 g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
3703 b = y_fixed + cb* stbi__float2fixed(1.77200f);
3707 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3708 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3709 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3710 out[0] = (stbi_uc)r;
3711 out[1] = (stbi_uc)g;
3712 out[2] = (stbi_uc)b;
3719 // set up the kernels
3720 static void stbi__setup_jpeg(stbi__jpeg *j)
3722 j->idct_block_kernel = stbi__idct_block;
3723 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3724 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3727 if (stbi__sse2_available()) {
3728 j->idct_block_kernel = stbi__idct_simd;
3729 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3730 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3735 j->idct_block_kernel = stbi__idct_simd;
3736 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3737 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3741 // clean up the temporary component buffers
3742 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3744 stbi__free_jpeg_components(j, j->s->img_n, 0);
3749 resample_row_func resample;
3750 stbi_uc *line0,*line1;
3751 int hs,vs; // expansion factor in each axis
3752 int w_lores; // horizontal pixels pre-expansion
3753 int ystep; // how far through vertical expansion we are
3754 int ypos; // which pre-expansion row we're on
3757 // fast 0..255 * 0..255 => 0..255 rounded multiplication
3758 static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
3760 unsigned int t = x*y + 128;
3761 return (stbi_uc) ((t + (t >>8)) >> 8);
3764 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3766 int n, decode_n, is_rgb;
3767 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3769 // validate req_comp
3770 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3772 // load a jpeg image from whichever source, but leave in YCbCr format
3773 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3775 // determine actual number of components to generate
3776 n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
3778 is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
3780 if (z->s->img_n == 3 && n < 3 && !is_rgb)
3783 decode_n = z->s->img_n;
3785 // resample and color-convert
3790 stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL };
3792 stbi__resample res_comp[4];
3794 for (k=0; k < decode_n; ++k) {
3795 stbi__resample *r = &res_comp[k];
3797 // allocate line buffer big enough for upsampling off the edges
3798 // with upsample factor of 4
3799 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3800 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3802 r->hs = z->img_h_max / z->img_comp[k].h;
3803 r->vs = z->img_v_max / z->img_comp[k].v;
3804 r->ystep = r->vs >> 1;
3805 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3807 r->line0 = r->line1 = z->img_comp[k].data;
3809 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3810 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3811 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3812 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3813 else r->resample = stbi__resample_row_generic;
3816 // can't error after this so, this is safe
3817 output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
3818 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3820 // now go ahead and resample
3821 for (j=0; j < z->s->img_y; ++j) {
3822 stbi_uc *out = output + n * z->s->img_x * j;
3823 for (k=0; k < decode_n; ++k) {
3824 stbi__resample *r = &res_comp[k];
3825 int y_bot = r->ystep >= (r->vs >> 1);
3826 coutput[k] = r->resample(z->img_comp[k].linebuf,
3827 y_bot ? r->line1 : r->line0,
3828 y_bot ? r->line0 : r->line1,
3830 if (++r->ystep >= r->vs) {
3832 r->line0 = r->line1;
3833 if (++r->ypos < z->img_comp[k].y)
3834 r->line1 += z->img_comp[k].w2;
3838 stbi_uc *y = coutput[0];
3839 if (z->s->img_n == 3) {
3841 for (i=0; i < z->s->img_x; ++i) {
3843 out[1] = coutput[1][i];
3844 out[2] = coutput[2][i];
3849 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3851 } else if (z->s->img_n == 4) {
3852 if (z->app14_color_transform == 0) { // CMYK
3853 for (i=0; i < z->s->img_x; ++i) {
3854 stbi_uc m = coutput[3][i];
3855 out[0] = stbi__blinn_8x8(coutput[0][i], m);
3856 out[1] = stbi__blinn_8x8(coutput[1][i], m);
3857 out[2] = stbi__blinn_8x8(coutput[2][i], m);
3861 } else if (z->app14_color_transform == 2) { // YCCK
3862 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3863 for (i=0; i < z->s->img_x; ++i) {
3864 stbi_uc m = coutput[3][i];
3865 out[0] = stbi__blinn_8x8(255 - out[0], m);
3866 out[1] = stbi__blinn_8x8(255 - out[1], m);
3867 out[2] = stbi__blinn_8x8(255 - out[2], m);
3870 } else { // YCbCr + alpha? Ignore the fourth channel for now
3871 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3874 for (i=0; i < z->s->img_x; ++i) {
3875 out[0] = out[1] = out[2] = y[i];
3876 out[3] = 255; // not used if n==3
3882 for (i=0; i < z->s->img_x; ++i)
3883 *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
3885 for (i=0; i < z->s->img_x; ++i, out += 2) {
3886 out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
3890 } else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
3891 for (i=0; i < z->s->img_x; ++i) {
3892 stbi_uc m = coutput[3][i];
3893 stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
3894 stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
3895 stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
3896 out[0] = stbi__compute_y(r, g, b);
3900 } else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
3901 for (i=0; i < z->s->img_x; ++i) {
3902 out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
3907 stbi_uc *y = coutput[0];
3909 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3911 for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; }
3915 stbi__cleanup_jpeg(z);
3916 *out_x = z->s->img_x;
3917 *out_y = z->s->img_y;
3918 if (comp) *comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
3923 static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
3925 unsigned char* result;
3926 stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
3929 stbi__setup_jpeg(j);
3930 result = load_jpeg_image(j, x,y,comp,req_comp);
3935 static int stbi__jpeg_test(stbi__context *s)
3938 stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
3940 stbi__setup_jpeg(j);
3941 r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
3947 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3949 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3950 stbi__rewind( j->s );
3953 if (x) *x = j->s->img_x;
3954 if (y) *y = j->s->img_y;
3955 if (comp) *comp = j->s->img_n >= 3 ? 3 : 1;
3959 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3962 stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
3964 result = stbi__jpeg_info_raw(j, x, y, comp);
3970 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3971 // simple implementation
3972 // - all input must be provided in an upfront buffer
3973 // - all output is written to a single output buffer (can malloc/realloc)
3977 #ifndef STBI_NO_ZLIB
3979 // fast-way is faster to check than jpeg huffman, but slow way is slower
3980 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3981 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3983 // zlib-style huffman encoding
3984 // (jpegs packs from left, zlib from right, so can't share code)
3987 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3988 stbi__uint16 firstcode[16];
3990 stbi__uint16 firstsymbol[16];
3992 stbi__uint16 value[288];
3995 stbi_inline static int stbi__bitreverse16(int n)
3997 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3998 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3999 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
4000 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
4004 stbi_inline static int stbi__bit_reverse(int v, int bits)
4006 STBI_ASSERT(bits <= 16);
4007 // to bit reverse n bits, reverse 16 and shift
4008 // e.g. 11 bits, bit reverse and shift away 5
4009 return stbi__bitreverse16(v) >> (16-bits);
4012 static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num)
4015 int code, next_code[16], sizes[17];
4017 // DEFLATE spec for generating codes
4018 memset(sizes, 0, sizeof(sizes));
4019 memset(z->fast, 0, sizeof(z->fast));
4020 for (i=0; i < num; ++i)
4021 ++sizes[sizelist[i]];
4023 for (i=1; i < 16; ++i)
4024 if (sizes[i] > (1 << i))
4025 return stbi__err("bad sizes", "Corrupt PNG");
4027 for (i=1; i < 16; ++i) {
4028 next_code[i] = code;
4029 z->firstcode[i] = (stbi__uint16) code;
4030 z->firstsymbol[i] = (stbi__uint16) k;
4031 code = (code + sizes[i]);
4033 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
4034 z->maxcode[i] = code << (16-i); // preshift for inner loop
4038 z->maxcode[16] = 0x10000; // sentinel
4039 for (i=0; i < num; ++i) {
4040 int s = sizelist[i];
4042 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
4043 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
4044 z->size [c] = (stbi_uc ) s;
4045 z->value[c] = (stbi__uint16) i;
4046 if (s <= STBI__ZFAST_BITS) {
4047 int j = stbi__bit_reverse(next_code[s],s);
4048 while (j < (1 << STBI__ZFAST_BITS)) {
4059 // zlib-from-memory implementation for PNG reading
4060 // because PNG allows splitting the zlib stream arbitrarily,
4061 // and it's annoying structurally to have PNG call ZLIB call PNG,
4062 // we require PNG read all the IDATs and combine them into a single
4067 stbi_uc *zbuffer, *zbuffer_end;
4069 stbi__uint32 code_buffer;
4076 stbi__zhuffman z_length, z_distance;
4079 stbi_inline static int stbi__zeof(stbi__zbuf *z)
4081 return (z->zbuffer >= z->zbuffer_end);
4084 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
4086 return stbi__zeof(z) ? 0 : *z->zbuffer++;
4089 static void stbi__fill_bits(stbi__zbuf *z)
4092 if (z->code_buffer >= (1U << z->num_bits)) {
4093 z->zbuffer = z->zbuffer_end; /* treat this as EOF so we fail. */
4096 z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
4098 } while (z->num_bits <= 24);
4101 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
4104 if (z->num_bits < n) stbi__fill_bits(z);
4105 k = z->code_buffer & ((1 << n) - 1);
4106 z->code_buffer >>= n;
4111 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
4114 // not resolved by fast table, so compute it the slow way
4115 // use jpeg approach, which requires MSbits at top
4116 k = stbi__bit_reverse(a->code_buffer, 16);
4117 for (s=STBI__ZFAST_BITS+1; ; ++s)
4118 if (k < z->maxcode[s])
4120 if (s >= 16) return -1; // invalid code!
4121 // code size is s, so:
4122 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
4123 if (b >= sizeof (z->size)) return -1; // some data was corrupt somewhere!
4124 if (z->size[b] != s) return -1; // was originally an assert, but report failure instead.
4125 a->code_buffer >>= s;
4130 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
4133 if (a->num_bits < 16) {
4134 if (stbi__zeof(a)) {
4135 return -1; /* report error for unexpected end of data. */
4139 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
4142 a->code_buffer >>= s;
4146 return stbi__zhuffman_decode_slowpath(a, z);
4149 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
4152 unsigned int cur, limit, old_limit;
4154 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
4155 cur = (unsigned int) (z->zout - z->zout_start);
4156 limit = old_limit = (unsigned) (z->zout_end - z->zout_start);
4157 if (UINT_MAX - cur < (unsigned) n) return stbi__err("outofmem", "Out of memory");
4158 while (cur + n > limit) {
4159 if(limit > UINT_MAX / 2) return stbi__err("outofmem", "Out of memory");
4162 q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
4163 STBI_NOTUSED(old_limit);
4164 if (q == NULL) return stbi__err("outofmem", "Out of memory");
4167 z->zout_end = q + limit;
4171 static const int stbi__zlength_base[31] = {
4172 3,4,5,6,7,8,9,10,11,13,
4173 15,17,19,23,27,31,35,43,51,59,
4174 67,83,99,115,131,163,195,227,258,0,0 };
4176 static const int stbi__zlength_extra[31]=
4177 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
4179 static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
4180 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
4182 static const int stbi__zdist_extra[32] =
4183 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
4185 static int stbi__parse_huffman_block(stbi__zbuf *a)
4187 char *zout = a->zout;
4189 int z = stbi__zhuffman_decode(a, &a->z_length);
4191 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
4192 if (zout >= a->zout_end) {
4193 if (!stbi__zexpand(a, zout, 1)) return 0;
4205 len = stbi__zlength_base[z];
4206 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
4207 z = stbi__zhuffman_decode(a, &a->z_distance);
4208 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
4209 dist = stbi__zdist_base[z];
4210 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
4211 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
4212 if (zout + len > a->zout_end) {
4213 if (!stbi__zexpand(a, zout, len)) return 0;
4216 p = (stbi_uc *) (zout - dist);
4217 if (dist == 1) { // run of one byte; common in images.
4219 if (len) { do *zout++ = v; while (--len); }
4221 if (len) { do *zout++ = *p++; while (--len); }
4227 static int stbi__compute_huffman_codes(stbi__zbuf *a)
4229 static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
4230 stbi__zhuffman z_codelength;
4231 stbi_uc lencodes[286+32+137];//padding for maximum single op
4232 stbi_uc codelength_sizes[19];
4235 int hlit = stbi__zreceive(a,5) + 257;
4236 int hdist = stbi__zreceive(a,5) + 1;
4237 int hclen = stbi__zreceive(a,4) + 4;
4238 int ntot = hlit + hdist;
4240 memset(codelength_sizes, 0, sizeof(codelength_sizes));
4241 for (i=0; i < hclen; ++i) {
4242 int s = stbi__zreceive(a,3);
4243 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
4245 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
4249 int c = stbi__zhuffman_decode(a, &z_codelength);
4250 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
4252 lencodes[n++] = (stbi_uc) c;
4256 c = stbi__zreceive(a,2)+3;
4257 if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
4258 fill = lencodes[n-1];
4259 } else if (c == 17) {
4260 c = stbi__zreceive(a,3)+3;
4261 } else if (c == 18) {
4262 c = stbi__zreceive(a,7)+11;
4264 return stbi__err("bad codelengths", "Corrupt PNG");
4266 if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
4267 memset(lencodes+n, fill, c);
4271 if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG");
4272 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
4273 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
4277 static int stbi__parse_uncompressed_block(stbi__zbuf *a)
4281 if (a->num_bits & 7)
4282 stbi__zreceive(a, a->num_bits & 7); // discard
4283 // drain the bit-packed data into header
4285 while (a->num_bits > 0) {
4286 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
4287 a->code_buffer >>= 8;
4290 if (a->num_bits < 0) return stbi__err("zlib corrupt","Corrupt PNG");
4291 // now fill header the normal way
4293 header[k++] = stbi__zget8(a);
4294 len = header[1] * 256 + header[0];
4295 nlen = header[3] * 256 + header[2];
4296 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
4297 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
4298 if (a->zout + len > a->zout_end)
4299 if (!stbi__zexpand(a, a->zout, len)) return 0;
4300 memcpy(a->zout, a->zbuffer, len);
4306 static int stbi__parse_zlib_header(stbi__zbuf *a)
4308 int cmf = stbi__zget8(a);
4310 /* int cinfo = cmf >> 4; */
4311 int flg = stbi__zget8(a);
4312 if (stbi__zeof(a)) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
4313 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
4314 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
4315 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
4316 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
4320 static const stbi_uc stbi__zdefault_length[288] =
4322 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4323 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4324 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4325 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
4326 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4327 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4328 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4329 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
4330 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
4332 static const stbi_uc stbi__zdefault_distance[32] =
4334 5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
4339 int i; // use <= to match clearly with spec
4340 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
4341 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
4342 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
4343 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
4345 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
4349 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
4353 if (!stbi__parse_zlib_header(a)) return 0;
4357 final = stbi__zreceive(a,1);
4358 type = stbi__zreceive(a,2);
4360 if (!stbi__parse_uncompressed_block(a)) return 0;
4361 } else if (type == 3) {
4365 // use fixed code lengths
4366 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
4367 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
4369 if (!stbi__compute_huffman_codes(a)) return 0;
4371 if (!stbi__parse_huffman_block(a)) return 0;
4377 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
4379 a->zout_start = obuf;
4381 a->zout_end = obuf + olen;
4382 a->z_expandable = exp;
4384 return stbi__parse_zlib(a, parse_header);
4387 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
4390 char *p = (char *) stbi__malloc(initial_size);
4391 if (p == NULL) return NULL;
4392 a.zbuffer = (stbi_uc *) buffer;
4393 a.zbuffer_end = (stbi_uc *) buffer + len;
4394 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
4395 if (outlen) *outlen = (int) (a.zout - a.zout_start);
4396 return a.zout_start;
4398 STBI_FREE(a.zout_start);
4403 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
4405 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
4408 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
4411 char *p = (char *) stbi__malloc(initial_size);
4412 if (p == NULL) return NULL;
4413 a.zbuffer = (stbi_uc *) buffer;
4414 a.zbuffer_end = (stbi_uc *) buffer + len;
4415 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
4416 if (outlen) *outlen = (int) (a.zout - a.zout_start);
4417 return a.zout_start;
4419 STBI_FREE(a.zout_start);
4424 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
4427 a.zbuffer = (stbi_uc *) ibuffer;
4428 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
4429 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
4430 return (int) (a.zout - a.zout_start);
4435 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
4438 char *p = (char *) stbi__malloc(16384);
4439 if (p == NULL) return NULL;
4440 a.zbuffer = (stbi_uc *) buffer;
4441 a.zbuffer_end = (stbi_uc *) buffer+len;
4442 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
4443 if (outlen) *outlen = (int) (a.zout - a.zout_start);
4444 return a.zout_start;
4446 STBI_FREE(a.zout_start);
4451 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
4454 a.zbuffer = (stbi_uc *) ibuffer;
4455 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
4456 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
4457 return (int) (a.zout - a.zout_start);
4463 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
4464 // simple implementation
4465 // - only 8-bit samples
4466 // - no CRC checking
4467 // - allocates lots of intermediate memory
4468 // - avoids problem of streaming data between subsystems
4469 // - avoids explicit window management
4471 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
4476 stbi__uint32 length;
4480 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
4483 c.length = stbi__get32be(s);
4484 c.type = stbi__get32be(s);
4488 static int stbi__check_png_header(stbi__context *s)
4490 static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
4492 for (i=0; i < 8; ++i)
4493 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
4500 stbi_uc *idata, *expanded, *out;
4511 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
4516 static stbi_uc first_row_filter[5] =
4525 static int stbi__paeth(int a, int b, int c)
4531 if (pa <= pb && pa <= pc) return a;
4532 if (pb <= pc) return b;
4536 static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
4538 // create the png data from post-deflated data
4539 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4541 int bytes = (depth == 16? 2 : 1);
4542 stbi__context *s = a->s;
4543 stbi__uint32 i,j,stride = x*out_n*bytes;
4544 stbi__uint32 img_len, img_width_bytes;
4546 int img_n = s->img_n; // copy it into a local for later
4548 int output_bytes = out_n*bytes;
4549 int filter_bytes = img_n*bytes;
4552 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4553 a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into
4554 if (!a->out) return stbi__err("outofmem", "Out of memory");
4556 if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
4557 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4558 img_len = (img_width_bytes + 1) * y;
4560 // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
4561 // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
4562 // so just check for raw_len < img_len always.
4563 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4565 for (j=0; j < y; ++j) {
4566 stbi_uc *cur = a->out + stride*j;
4568 int filter = *raw++;
4571 return stbi__err("invalid filter","Corrupt PNG");
4574 if (img_width_bytes > x) return stbi__err("invalid width","Corrupt PNG");
4575 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4577 width = img_width_bytes;
4579 prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above
4581 // if first row, use special filter that doesn't sample previous row
4582 if (j == 0) filter = first_row_filter[filter];
4584 // handle first byte explicitly
4585 for (k=0; k < filter_bytes; ++k) {
4587 case STBI__F_none : cur[k] = raw[k]; break;
4588 case STBI__F_sub : cur[k] = raw[k]; break;
4589 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4590 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4591 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4592 case STBI__F_avg_first : cur[k] = raw[k]; break;
4593 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4599 cur[img_n] = 255; // first pixel
4603 } else if (depth == 16) {
4604 if (img_n != out_n) {
4605 cur[filter_bytes] = 255; // first pixel top byte
4606 cur[filter_bytes+1] = 255; // first pixel bottom byte
4608 raw += filter_bytes;
4609 cur += output_bytes;
4610 prior += output_bytes;
4617 // this is a little gross, so that we don't switch per-pixel or per-component
4618 if (depth < 8 || img_n == out_n) {
4619 int nk = (width - 1)*filter_bytes;
4620 #define STBI__CASE(f) \
4622 for (k=0; k < nk; ++k)
4624 // "none" filter turns into a memcpy here; make that explicit.
4625 case STBI__F_none: memcpy(cur, raw, nk); break;
4626 STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); } break;
4627 STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
4628 STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); } break;
4629 STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); } break;
4630 STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); } break;
4631 STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); } break;
4636 STBI_ASSERT(img_n+1 == out_n);
4637 #define STBI__CASE(f) \
4639 for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4640 for (k=0; k < filter_bytes; ++k)
4642 STBI__CASE(STBI__F_none) { cur[k] = raw[k]; } break;
4643 STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); } break;
4644 STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
4645 STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); } break;
4646 STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); } break;
4647 STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); } break;
4648 STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); } break;
4652 // the loop above sets the high byte of the pixels' alpha, but for
4653 // 16 bit png files we also need the low byte set. we'll do that here.
4655 cur = a->out + stride*j; // start at the beginning of the row again
4656 for (i=0; i < x; ++i,cur+=output_bytes) {
4657 cur[filter_bytes+1] = 255;
4663 // we make a separate pass to expand bits to pixels; for performance,
4664 // this could run two scanlines behind the above code, so it won't
4665 // intefere with filtering but will still be in the cache.
4667 for (j=0; j < y; ++j) {
4668 stbi_uc *cur = a->out + stride*j;
4669 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4670 // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4671 // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4672 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4674 // note that the final byte might overshoot and write more data than desired.
4675 // we can allocate enough data that this never writes out of memory, but it
4676 // could also overwrite the next scanline. can it overwrite non-empty data
4677 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4678 // so we need to explicitly clamp the final ones
4681 for (k=x*img_n; k >= 2; k-=2, ++in) {
4682 *cur++ = scale * ((*in >> 4) );
4683 *cur++ = scale * ((*in ) & 0x0f);
4685 if (k > 0) *cur++ = scale * ((*in >> 4) );
4686 } else if (depth == 2) {
4687 for (k=x*img_n; k >= 4; k-=4, ++in) {
4688 *cur++ = scale * ((*in >> 6) );
4689 *cur++ = scale * ((*in >> 4) & 0x03);
4690 *cur++ = scale * ((*in >> 2) & 0x03);
4691 *cur++ = scale * ((*in ) & 0x03);
4693 if (k > 0) *cur++ = scale * ((*in >> 6) );
4694 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4695 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4696 } else if (depth == 1) {
4697 for (k=x*img_n; k >= 8; k-=8, ++in) {
4698 *cur++ = scale * ((*in >> 7) );
4699 *cur++ = scale * ((*in >> 6) & 0x01);
4700 *cur++ = scale * ((*in >> 5) & 0x01);
4701 *cur++ = scale * ((*in >> 4) & 0x01);
4702 *cur++ = scale * ((*in >> 3) & 0x01);
4703 *cur++ = scale * ((*in >> 2) & 0x01);
4704 *cur++ = scale * ((*in >> 1) & 0x01);
4705 *cur++ = scale * ((*in ) & 0x01);
4707 if (k > 0) *cur++ = scale * ((*in >> 7) );
4708 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4709 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4710 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4711 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4712 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4713 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4715 if (img_n != out_n) {
4717 // insert alpha = 255
4718 cur = a->out + stride*j;
4720 for (q=x-1; q >= 0; --q) {
4722 cur[q*2+0] = cur[q];
4725 STBI_ASSERT(img_n == 3);
4726 for (q=x-1; q >= 0; --q) {
4728 cur[q*4+2] = cur[q*3+2];
4729 cur[q*4+1] = cur[q*3+1];
4730 cur[q*4+0] = cur[q*3+0];
4735 } else if (depth == 16) {
4736 // force the image data from big-endian to platform-native.
4737 // this is done in a separate pass due to the decoding relying
4738 // on the data being untouched, but could probably be done
4739 // per-line during decode if care is taken.
4740 stbi_uc *cur = a->out;
4741 stbi__uint16 *cur16 = (stbi__uint16*)cur;
4743 for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
4744 *cur16 = (cur[0] << 8) | cur[1];
4751 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4753 int bytes = (depth == 16 ? 2 : 1);
4754 int out_bytes = out_n * bytes;
4758 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4761 final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
4762 for (p=0; p < 7; ++p) {
4763 int xorig[] = { 0,4,0,2,0,1,0 };
4764 int yorig[] = { 0,0,4,0,2,0,1 };
4765 int xspc[] = { 8,8,4,4,2,2,1 };
4766 int yspc[] = { 8,8,8,4,4,2,2 };
4768 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4769 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4770 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4772 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4773 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4777 for (j=0; j < y; ++j) {
4778 for (i=0; i < x; ++i) {
4779 int out_y = j*yspc[p]+yorig[p];
4780 int out_x = i*xspc[p]+xorig[p];
4781 memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
4782 a->out + (j*x+i)*out_bytes, out_bytes);
4786 image_data += img_len;
4787 image_data_len -= img_len;
4795 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4797 stbi__context *s = z->s;
4798 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4799 stbi_uc *p = z->out;
4801 // compute color-based transparency, assuming we've
4802 // already got 255 as the alpha value in the output
4803 STBI_ASSERT(out_n == 2 || out_n == 4);
4806 for (i=0; i < pixel_count; ++i) {
4807 p[1] = (p[0] == tc[0] ? 0 : 255);
4811 for (i=0; i < pixel_count; ++i) {
4812 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4820 static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
4822 stbi__context *s = z->s;
4823 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4824 stbi__uint16 *p = (stbi__uint16*) z->out;
4826 // compute color-based transparency, assuming we've
4827 // already got 65535 as the alpha value in the output
4828 STBI_ASSERT(out_n == 2 || out_n == 4);
4831 for (i = 0; i < pixel_count; ++i) {
4832 p[1] = (p[0] == tc[0] ? 0 : 65535);
4836 for (i = 0; i < pixel_count; ++i) {
4837 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4845 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4847 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4848 stbi_uc *p, *temp_out, *orig = a->out;
4850 p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0);
4851 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4853 // between here and free(out) below, exitting would leak
4856 if (pal_img_n == 3) {
4857 for (i=0; i < pixel_count; ++i) {
4860 p[1] = palette[n+1];
4861 p[2] = palette[n+2];
4865 for (i=0; i < pixel_count; ++i) {
4868 p[1] = palette[n+1];
4869 p[2] = palette[n+2];
4870 p[3] = palette[n+3];
4882 static int stbi__unpremultiply_on_load = 0;
4883 static int stbi__de_iphone_flag = 0;
4885 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4887 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4890 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4892 stbi__de_iphone_flag = flag_true_if_should_convert;
4895 static void stbi__de_iphone(stbi__png *z)
4897 stbi__context *s = z->s;
4898 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4899 stbi_uc *p = z->out;
4901 if (s->img_out_n == 3) { // convert bgr to rgb
4902 for (i=0; i < pixel_count; ++i) {
4909 STBI_ASSERT(s->img_out_n == 4);
4910 if (stbi__unpremultiply_on_load) {
4911 // convert bgr to rgb and unpremultiply
4912 for (i=0; i < pixel_count; ++i) {
4916 stbi_uc half = a / 2;
4917 p[0] = (p[2] * 255 + half) / a;
4918 p[1] = (p[1] * 255 + half) / a;
4919 p[2] = ( t * 255 + half) / a;
4927 // convert bgr to rgb
4928 for (i=0; i < pixel_count; ++i) {
4938 #define STBI__PNG_TYPE(a,b,c,d) (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
4940 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4942 stbi_uc palette[1024], pal_img_n=0;
4943 stbi_uc has_trans=0, tc[3]={0};
4944 stbi__uint16 tc16[3];
4945 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4946 int first=1,k,interlace=0, color=0, is_iphone=0;
4947 stbi__context *s = z->s;
4953 if (!stbi__check_png_header(s)) return 0;
4955 if (scan == STBI__SCAN_type) return 1;
4958 stbi__pngchunk c = stbi__get_chunk_header(s);
4960 case STBI__PNG_TYPE('C','g','B','I'):
4962 stbi__skip(s, c.length);
4964 case STBI__PNG_TYPE('I','H','D','R'): {
4966 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4968 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4969 s->img_x = stbi__get32be(s);
4970 s->img_y = stbi__get32be(s);
4971 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
4972 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
4973 z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
4974 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4975 if (color == 3 && z->depth == 16) return stbi__err("bad ctype","Corrupt PNG");
4976 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4977 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4978 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4979 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4980 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4982 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4983 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4984 if (scan == STBI__SCAN_header) return 1;
4986 // if paletted, then pal_n is our final components, and
4987 // img_n is # components to decompress/filter.
4989 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4990 // if SCAN_header, have to scan to see if we have a tRNS
4995 case STBI__PNG_TYPE('P','L','T','E'): {
4996 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4997 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4998 pal_len = c.length / 3;
4999 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
5000 for (i=0; i < pal_len; ++i) {
5001 palette[i*4+0] = stbi__get8(s);
5002 palette[i*4+1] = stbi__get8(s);
5003 palette[i*4+2] = stbi__get8(s);
5004 palette[i*4+3] = 255;
5009 case STBI__PNG_TYPE('t','R','N','S'): {
5010 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5011 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
5013 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
5014 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
5015 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
5017 for (i=0; i < c.length; ++i)
5018 palette[i*4+3] = stbi__get8(s);
5020 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
5021 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
5023 if (z->depth == 16) {
5024 for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
5026 for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
5032 case STBI__PNG_TYPE('I','D','A','T'): {
5033 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5034 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
5035 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
5036 if ((int)(ioff + c.length) < (int)ioff) return 0;
5037 if (ioff + c.length > idata_limit) {
5038 stbi__uint32 idata_limit_old = idata_limit;
5040 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
5041 while (ioff + c.length > idata_limit)
5043 STBI_NOTUSED(idata_limit_old);
5044 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
5047 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
5052 case STBI__PNG_TYPE('I','E','N','D'): {
5053 stbi__uint32 raw_len, bpl;
5054 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5055 if (scan != STBI__SCAN_load) return 1;
5056 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
5057 // initial guess for decoded data size to avoid unnecessary reallocs
5058 bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
5059 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
5060 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
5061 if (z->expanded == NULL) return 0; // zlib should set error
5062 STBI_FREE(z->idata); z->idata = NULL;
5063 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
5064 s->img_out_n = s->img_n+1;
5066 s->img_out_n = s->img_n;
5067 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
5069 if (z->depth == 16) {
5070 if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
5072 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
5075 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
5078 // pal_img_n == 3 or 4
5079 s->img_n = pal_img_n; // record the actual colors we had
5080 s->img_out_n = pal_img_n;
5081 if (req_comp >= 3) s->img_out_n = req_comp;
5082 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
5084 } else if (has_trans) {
5085 // non-paletted image with tRNS -> source image has (constant) alpha
5088 STBI_FREE(z->expanded); z->expanded = NULL;
5089 // end of PNG chunk, read and skip CRC
5095 // if critical, fail
5096 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5097 if ((c.type & (1 << 29)) == 0) {
5098 #ifndef STBI_NO_FAILURE_STRINGS
5100 static char invalid_chunk[] = "XXXX PNG chunk not known";
5101 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
5102 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
5103 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
5104 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
5106 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
5108 stbi__skip(s, c.length);
5111 // end of PNG chunk, read and skip CRC
5116 static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri)
5119 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
5120 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
5122 ri->bits_per_channel = 8;
5123 else if (p->depth == 16)
5124 ri->bits_per_channel = 16;
5126 return stbi__errpuc("bad bits_per_channel", "PNG not supported: unsupported color depth");
5129 if (req_comp && req_comp != p->s->img_out_n) {
5130 if (ri->bits_per_channel == 8)
5131 result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
5133 result = stbi__convert_format16((stbi__uint16 *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
5134 p->s->img_out_n = req_comp;
5135 if (result == NULL) return result;
5139 if (n) *n = p->s->img_n;
5141 STBI_FREE(p->out); p->out = NULL;
5142 STBI_FREE(p->expanded); p->expanded = NULL;
5143 STBI_FREE(p->idata); p->idata = NULL;
5148 static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
5152 return stbi__do_png(&p, x,y,comp,req_comp, ri);
5155 static int stbi__png_test(stbi__context *s)
5158 r = stbi__check_png_header(s);
5163 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
5165 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
5166 stbi__rewind( p->s );
5169 if (x) *x = p->s->img_x;
5170 if (y) *y = p->s->img_y;
5171 if (comp) *comp = p->s->img_n;
5175 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
5179 return stbi__png_info_raw(&p, x, y, comp);
5182 static int stbi__png_is16(stbi__context *s)
5186 if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
5188 if (p.depth != 16) {
5196 // Microsoft/Windows BMP image
5199 static int stbi__bmp_test_raw(stbi__context *s)
5203 if (stbi__get8(s) != 'B') return 0;
5204 if (stbi__get8(s) != 'M') return 0;
5205 stbi__get32le(s); // discard filesize
5206 stbi__get16le(s); // discard reserved
5207 stbi__get16le(s); // discard reserved
5208 stbi__get32le(s); // discard data offset
5209 sz = stbi__get32le(s);
5210 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
5214 static int stbi__bmp_test(stbi__context *s)
5216 int r = stbi__bmp_test_raw(s);
5222 // returns 0..31 for the highest set bit
5223 static int stbi__high_bit(unsigned int z)
5226 if (z == 0) return -1;
5227 if (z >= 0x10000) { n += 16; z >>= 16; }
5228 if (z >= 0x00100) { n += 8; z >>= 8; }
5229 if (z >= 0x00010) { n += 4; z >>= 4; }
5230 if (z >= 0x00004) { n += 2; z >>= 2; }
5231 if (z >= 0x00002) { n += 1;/* >>= 1;*/ }
5235 static int stbi__bitcount(unsigned int a)
5237 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
5238 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
5239 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
5240 a = (a + (a >> 8)); // max 16 per 8 bits
5241 a = (a + (a >> 16)); // max 32 per 8 bits
5245 // extract an arbitrarily-aligned N-bit value (N=bits)
5246 // from v, and then make it 8-bits long and fractionally
5247 // extend it to full full range.
5248 static int stbi__shiftsigned(unsigned int v, int shift, int bits)
5250 static unsigned int mul_table[9] = {
5252 0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
5253 0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
5255 static unsigned int shift_table[9] = {
5262 STBI_ASSERT(v < 256);
5264 STBI_ASSERT(bits >= 0 && bits <= 8);
5265 return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits];
5270 int bpp, offset, hsz;
5271 unsigned int mr,mg,mb,ma, all_a;
5275 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
5278 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
5279 stbi__get32le(s); // discard filesize
5280 stbi__get16le(s); // discard reserved
5281 stbi__get16le(s); // discard reserved
5282 info->offset = stbi__get32le(s);
5283 info->hsz = hsz = stbi__get32le(s);
5284 info->mr = info->mg = info->mb = info->ma = 0;
5285 info->extra_read = 14;
5287 if (info->offset < 0) return stbi__errpuc("bad BMP", "bad BMP");
5289 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
5291 s->img_x = stbi__get16le(s);
5292 s->img_y = stbi__get16le(s);
5294 s->img_x = stbi__get32le(s);
5295 s->img_y = stbi__get32le(s);
5297 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
5298 info->bpp = stbi__get16le(s);
5300 int compress = stbi__get32le(s);
5301 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
5302 stbi__get32le(s); // discard sizeof
5303 stbi__get32le(s); // discard hres
5304 stbi__get32le(s); // discard vres
5305 stbi__get32le(s); // discard colorsused
5306 stbi__get32le(s); // discard max important
5307 if (hsz == 40 || hsz == 56) {
5314 if (info->bpp == 16 || info->bpp == 32) {
5315 if (compress == 0) {
5316 if (info->bpp == 32) {
5317 info->mr = 0xffu << 16;
5318 info->mg = 0xffu << 8;
5319 info->mb = 0xffu << 0;
5320 info->ma = 0xffu << 24;
5321 info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
5323 info->mr = 31u << 10;
5324 info->mg = 31u << 5;
5325 info->mb = 31u << 0;
5327 } else if (compress == 3) {
5328 info->mr = stbi__get32le(s);
5329 info->mg = stbi__get32le(s);
5330 info->mb = stbi__get32le(s);
5331 info->extra_read += 12;
5332 // not documented, but generated by photoshop and handled by mspaint
5333 if (info->mr == info->mg && info->mg == info->mb) {
5335 return stbi__errpuc("bad BMP", "bad BMP");
5338 return stbi__errpuc("bad BMP", "bad BMP");
5342 if (hsz != 108 && hsz != 124)
5343 return stbi__errpuc("bad BMP", "bad BMP");
5344 info->mr = stbi__get32le(s);
5345 info->mg = stbi__get32le(s);
5346 info->mb = stbi__get32le(s);
5347 info->ma = stbi__get32le(s);
5348 stbi__get32le(s); // discard color space
5349 for (i=0; i < 12; ++i)
5350 stbi__get32le(s); // discard color space parameters
5352 stbi__get32le(s); // discard rendering intent
5353 stbi__get32le(s); // discard offset of profile data
5354 stbi__get32le(s); // discard size of profile data
5355 stbi__get32le(s); // discard reserved
5363 static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
5366 unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
5367 stbi_uc pal[256][4];
5368 int psize=0,i,j,width;
5369 int flip_vertically, pad, target;
5370 stbi__bmp_data info;
5374 if (stbi__bmp_parse_header(s, &info) == NULL)
5375 return NULL; // error code already set
5377 flip_vertically = ((int) s->img_y) > 0;
5378 s->img_y = abs((int) s->img_y);
5380 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5381 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5389 if (info.hsz == 12) {
5391 psize = (info.offset - info.extra_read - 24) / 3;
5394 psize = (info.offset - info.extra_read - info.hsz) >> 2;
5397 STBI_ASSERT(info.offset == s->callback_already_read + (int) (s->img_buffer - s->img_buffer_original));
5398 if (info.offset != s->callback_already_read + (s->img_buffer - s->buffer_start)) {
5399 return stbi__errpuc("bad offset", "Corrupt BMP");
5403 if (info.bpp == 24 && ma == 0xff000000)
5406 s->img_n = ma ? 4 : 3;
5407 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
5410 target = s->img_n; // if they want monochrome, we'll post-convert
5412 // sanity-check size
5413 if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
5414 return stbi__errpuc("too large", "Corrupt BMP");
5416 out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
5417 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5418 if (info.bpp < 16) {
5420 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
5421 for (i=0; i < psize; ++i) {
5422 pal[i][2] = stbi__get8(s);
5423 pal[i][1] = stbi__get8(s);
5424 pal[i][0] = stbi__get8(s);
5425 if (info.hsz != 12) stbi__get8(s);
5428 stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
5429 if (info.bpp == 1) width = (s->img_x + 7) >> 3;
5430 else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
5431 else if (info.bpp == 8) width = s->img_x;
5432 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
5434 if (info.bpp == 1) {
5435 for (j=0; j < (int) s->img_y; ++j) {
5436 int bit_offset = 7, v = stbi__get8(s);
5437 for (i=0; i < (int) s->img_x; ++i) {
5438 int color = (v>>bit_offset)&0x1;
5439 out[z++] = pal[color][0];
5440 out[z++] = pal[color][1];
5441 out[z++] = pal[color][2];
5442 if (target == 4) out[z++] = 255;
5443 if (i+1 == (int) s->img_x) break;
5444 if((--bit_offset) < 0) {
5452 for (j=0; j < (int) s->img_y; ++j) {
5453 for (i=0; i < (int) s->img_x; i += 2) {
5454 int v=stbi__get8(s),v2=0;
5455 if (info.bpp == 4) {
5459 out[z++] = pal[v][0];
5460 out[z++] = pal[v][1];
5461 out[z++] = pal[v][2];
5462 if (target == 4) out[z++] = 255;
5463 if (i+1 == (int) s->img_x) break;
5464 v = (info.bpp == 8) ? stbi__get8(s) : v2;
5465 out[z++] = pal[v][0];
5466 out[z++] = pal[v][1];
5467 out[z++] = pal[v][2];
5468 if (target == 4) out[z++] = 255;
5474 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
5477 stbi__skip(s, info.offset - info.extra_read - info.hsz);
5478 if (info.bpp == 24) width = 3 * s->img_x;
5479 else if (info.bpp == 16) width = 2*s->img_x;
5480 else /* bpp = 32 and pad = 0 */ width=0;
5482 if (info.bpp == 24) {
5484 } else if (info.bpp == 32) {
5485 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
5489 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
5490 // right shift amt to put high bit in position #7
5491 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
5492 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
5493 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
5494 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
5495 if (rcount > 8 || gcount > 8 || bcount > 8 || acount > 8) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
5497 for (j=0; j < (int) s->img_y; ++j) {
5499 for (i=0; i < (int) s->img_x; ++i) {
5501 out[z+2] = stbi__get8(s);
5502 out[z+1] = stbi__get8(s);
5503 out[z+0] = stbi__get8(s);
5505 a = (easy == 2 ? stbi__get8(s) : 255);
5507 if (target == 4) out[z++] = a;
5511 for (i=0; i < (int) s->img_x; ++i) {
5512 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
5514 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
5515 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
5516 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
5517 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
5519 if (target == 4) out[z++] = STBI__BYTECAST(a);
5526 // if alpha channel is all 0s, replace with all 255s
5527 if (target == 4 && all_a == 0)
5528 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
5531 if (flip_vertically) {
5533 for (j=0; j < (int) s->img_y>>1; ++j) {
5534 stbi_uc *p1 = out + j *s->img_x*target;
5535 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
5536 for (i=0; i < (int) s->img_x*target; ++i) {
5537 t = p1[i]; p1[i] = p2[i]; p2[i] = t;
5542 if (req_comp && req_comp != target) {
5543 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
5544 if (out == NULL) return out; // stbi__convert_format frees input on failure
5549 if (comp) *comp = s->img_n;
5554 // Targa Truevision - TGA
5555 // by Jonathan Dummer
5557 // returns STBI_rgb or whatever, 0 on error
5558 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
5560 // only RGB or RGBA (incl. 16bit) or grey allowed
5561 if (is_rgb16) *is_rgb16 = 0;
5562 switch(bits_per_pixel) {
5563 case 8: return STBI_grey;
5564 case 16: if(is_grey) return STBI_grey_alpha;
5566 case 15: if(is_rgb16) *is_rgb16 = 1;
5568 case 24: // fallthrough
5569 case 32: return bits_per_pixel/8;
5574 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
5576 int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
5577 int sz, tga_colormap_type;
5578 stbi__get8(s); // discard Offset
5579 tga_colormap_type = stbi__get8(s); // colormap type
5580 if( tga_colormap_type > 1 ) {
5582 return 0; // only RGB or indexed allowed
5584 tga_image_type = stbi__get8(s); // image type
5585 if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
5586 if (tga_image_type != 1 && tga_image_type != 9) {
5590 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5591 sz = stbi__get8(s); // check bits per palette color entry
5592 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
5596 stbi__skip(s,4); // skip image x and y origin
5597 tga_colormap_bpp = sz;
5598 } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5599 if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
5601 return 0; // only RGB or grey allowed, +/- RLE
5603 stbi__skip(s,9); // skip colormap specification and image x/y origin
5604 tga_colormap_bpp = 0;
5606 tga_w = stbi__get16le(s);
5609 return 0; // test width
5611 tga_h = stbi__get16le(s);
5614 return 0; // test height
5616 tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5617 stbi__get8(s); // ignore alpha bits
5618 if (tga_colormap_bpp != 0) {
5619 if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5620 // when using a colormap, tga_bits_per_pixel is the size of the indexes
5621 // I don't think anything but 8 or 16bit indexes makes sense
5625 tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5627 tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5635 if (comp) *comp = tga_comp;
5636 return 1; // seems to have passed everything
5639 static int stbi__tga_test(stbi__context *s)
5642 int sz, tga_color_type;
5643 stbi__get8(s); // discard Offset
5644 tga_color_type = stbi__get8(s); // color type
5645 if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
5646 sz = stbi__get8(s); // image type
5647 if ( tga_color_type == 1 ) { // colormapped (paletted) image
5648 if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5649 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5650 sz = stbi__get8(s); // check bits per palette color entry
5651 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5652 stbi__skip(s,4); // skip image x and y origin
5653 } else { // "normal" image w/o colormap
5654 if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
5655 stbi__skip(s,9); // skip colormap specification and image x/y origin
5657 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
5658 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
5659 sz = stbi__get8(s); // bits per pixel
5660 if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
5661 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5663 res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5670 // read 16bit value and convert to 24bit RGB
5671 static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
5673 stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
5674 stbi__uint16 fiveBitMask = 31;
5675 // we have 3 channels with 5bits each
5676 int r = (px >> 10) & fiveBitMask;
5677 int g = (px >> 5) & fiveBitMask;
5678 int b = px & fiveBitMask;
5679 // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
5680 out[0] = (stbi_uc)((r * 255)/31);
5681 out[1] = (stbi_uc)((g * 255)/31);
5682 out[2] = (stbi_uc)((b * 255)/31);
5684 // some people claim that the most significant bit might be used for alpha
5685 // (possibly if an alpha-bit is set in the "image descriptor byte")
5686 // but that only made 16bit test images completely translucent..
5687 // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
5690 static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
5692 // read in the TGA header stuff
5693 int tga_offset = stbi__get8(s);
5694 int tga_indexed = stbi__get8(s);
5695 int tga_image_type = stbi__get8(s);
5697 int tga_palette_start = stbi__get16le(s);
5698 int tga_palette_len = stbi__get16le(s);
5699 int tga_palette_bits = stbi__get8(s);
5700 int tga_x_origin = stbi__get16le(s);
5701 int tga_y_origin = stbi__get16le(s);
5702 int tga_width = stbi__get16le(s);
5703 int tga_height = stbi__get16le(s);
5704 int tga_bits_per_pixel = stbi__get8(s);
5705 int tga_comp, tga_rgb16=0;
5706 int tga_inverted = stbi__get8(s);
5707 // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5709 unsigned char *tga_data;
5710 unsigned char *tga_palette = NULL;
5712 unsigned char raw_data[4] = {0};
5714 int RLE_repeating = 0;
5715 int read_next_pixel = 1;
5717 STBI_NOTUSED(tga_x_origin); // @TODO
5718 STBI_NOTUSED(tga_y_origin); // @TODO
5720 if (tga_height > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5721 if (tga_width > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5723 // do a tiny bit of precessing
5724 if ( tga_image_type >= 8 )
5726 tga_image_type -= 8;
5729 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5731 // If I'm paletted, then I'll use the number of bits from the palette
5732 if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5733 else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5735 if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5736 return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5741 if (comp) *comp = tga_comp;
5743 if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
5744 return stbi__errpuc("too large", "Corrupt TGA");
5746 tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
5747 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5749 // skip to the data's starting position (offset usually = 0)
5750 stbi__skip(s, tga_offset );
5752 if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5753 for (i=0; i < tga_height; ++i) {
5754 int row = tga_inverted ? tga_height -i - 1 : i;
5755 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5756 stbi__getn(s, tga_row, tga_width * tga_comp);
5759 // do I need to load a palette?
5762 if (tga_palette_len == 0) { /* you have to have at least one entry! */
5763 STBI_FREE(tga_data);
5764 return stbi__errpuc("bad palette", "Corrupt TGA");
5767 // any data to skip? (offset usually = 0)
5768 stbi__skip(s, tga_palette_start );
5770 tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
5772 STBI_FREE(tga_data);
5773 return stbi__errpuc("outofmem", "Out of memory");
5776 stbi_uc *pal_entry = tga_palette;
5777 STBI_ASSERT(tga_comp == STBI_rgb);
5778 for (i=0; i < tga_palette_len; ++i) {
5779 stbi__tga_read_rgb16(s, pal_entry);
5780 pal_entry += tga_comp;
5782 } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5783 STBI_FREE(tga_data);
5784 STBI_FREE(tga_palette);
5785 return stbi__errpuc("bad palette", "Corrupt TGA");
5789 for (i=0; i < tga_width * tga_height; ++i)
5791 // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
5794 if ( RLE_count == 0 )
5796 // yep, get the next byte as a RLE command
5797 int RLE_cmd = stbi__get8(s);
5798 RLE_count = 1 + (RLE_cmd & 127);
5799 RLE_repeating = RLE_cmd >> 7;
5800 read_next_pixel = 1;
5801 } else if ( !RLE_repeating )
5803 read_next_pixel = 1;
5807 read_next_pixel = 1;
5809 // OK, if I need to read a pixel, do it now
5810 if ( read_next_pixel )
5812 // load however much data we did have
5815 // read in index, then perform the lookup
5816 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5817 if ( pal_idx >= tga_palette_len ) {
5821 pal_idx *= tga_comp;
5822 for (j = 0; j < tga_comp; ++j) {
5823 raw_data[j] = tga_palette[pal_idx+j];
5825 } else if(tga_rgb16) {
5826 STBI_ASSERT(tga_comp == STBI_rgb);
5827 stbi__tga_read_rgb16(s, raw_data);
5829 // read in the data raw
5830 for (j = 0; j < tga_comp; ++j) {
5831 raw_data[j] = stbi__get8(s);
5834 // clear the reading flag for the next pixel
5835 read_next_pixel = 0;
5836 } // end of reading a pixel
5839 for (j = 0; j < tga_comp; ++j)
5840 tga_data[i*tga_comp+j] = raw_data[j];
5842 // in case we're in RLE mode, keep counting down
5845 // do I need to invert the image?
5848 for (j = 0; j*2 < tga_height; ++j)
5850 int index1 = j * tga_width * tga_comp;
5851 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5852 for (i = tga_width * tga_comp; i > 0; --i)
5854 unsigned char temp = tga_data[index1];
5855 tga_data[index1] = tga_data[index2];
5856 tga_data[index2] = temp;
5862 // clear my palette, if I had one
5863 if ( tga_palette != NULL )
5865 STBI_FREE( tga_palette );
5869 // swap RGB - if the source data was RGB16, it already is in the right order
5870 if (tga_comp >= 3 && !tga_rgb16)
5872 unsigned char* tga_pixel = tga_data;
5873 for (i=0; i < tga_width * tga_height; ++i)
5875 unsigned char temp = tga_pixel[0];
5876 tga_pixel[0] = tga_pixel[2];
5877 tga_pixel[2] = temp;
5878 tga_pixel += tga_comp;
5882 // convert to target component count
5883 if (req_comp && req_comp != tga_comp)
5884 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5886 // the things I do to get rid of an error message, and yet keep
5887 // Microsoft's C compilers happy... [8^(
5888 tga_palette_start = tga_palette_len = tga_palette_bits =
5889 tga_x_origin = tga_y_origin = 0;
5890 STBI_NOTUSED(tga_palette_start);
5896 // *************************************************************************************************
5897 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5900 static int stbi__psd_test(stbi__context *s)
5902 int r = (stbi__get32be(s) == 0x38425053);
5907 static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount)
5909 int count, nleft, len;
5912 while ((nleft = pixelCount - count) > 0) {
5913 len = stbi__get8(s);
5916 } else if (len < 128) {
5917 // Copy next len+1 bytes literally.
5919 if (len > nleft) return 0; // corrupt data
5926 } else if (len > 128) {
5928 // Next -len+1 bytes in the dest are replicated from next source byte.
5929 // (Interpret len as a negative 8-bit int.)
5931 if (len > nleft) return 0; // corrupt data
5932 val = stbi__get8(s);
5945 static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
5948 int channelCount, compression;
5956 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5957 return stbi__errpuc("not PSD", "Corrupt PSD image");
5959 // Check file type version.
5960 if (stbi__get16be(s) != 1)
5961 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5963 // Skip 6 reserved bytes.
5966 // Read the number of channels (R, G, B, A, etc).
5967 channelCount = stbi__get16be(s);
5968 if (channelCount < 0 || channelCount > 16)
5969 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5971 // Read the rows and columns of the image.
5972 h = stbi__get32be(s);
5973 w = stbi__get32be(s);
5975 if (h > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5976 if (w > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5978 // Make sure the depth is 8 bits.
5979 bitdepth = stbi__get16be(s);
5980 if (bitdepth != 8 && bitdepth != 16)
5981 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5983 // Make sure the color mode is RGB.
5984 // Valid options are:
5993 if (stbi__get16be(s) != 3)
5994 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5996 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5997 stbi__skip(s,stbi__get32be(s) );
5999 // Skip the image resources. (resolution, pen tool paths, etc)
6000 stbi__skip(s, stbi__get32be(s) );
6002 // Skip the reserved data.
6003 stbi__skip(s, stbi__get32be(s) );
6005 // Find out if the data is compressed.
6007 // 0: no compression
6008 // 1: RLE compressed
6009 compression = stbi__get16be(s);
6010 if (compression > 1)
6011 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
6014 if (!stbi__mad3sizes_valid(4, w, h, 0))
6015 return stbi__errpuc("too large", "Corrupt PSD");
6017 // Create the destination image.
6019 if (!compression && bitdepth == 16 && bpc == 16) {
6020 out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0);
6021 ri->bits_per_channel = 16;
6023 out = (stbi_uc *) stbi__malloc(4 * w*h);
6025 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6028 // Initialize the data to zero.
6029 //memset( out, 0, pixelCount * 4 );
6031 // Finally, the image data.
6033 // RLE as used by .PSD and .TIFF
6034 // Loop until you get the number of unpacked bytes you are expecting:
6035 // Read the next source byte into n.
6036 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
6037 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
6038 // Else if n is 128, noop.
6041 // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
6042 // which we're going to just skip.
6043 stbi__skip(s, h * channelCount * 2 );
6045 // Read the RLE data by channel.
6046 for (channel = 0; channel < 4; channel++) {
6050 if (channel >= channelCount) {
6051 // Fill this channel with default data.
6052 for (i = 0; i < pixelCount; i++, p += 4)
6053 *p = (channel == 3 ? 255 : 0);
6055 // Read the RLE data.
6056 if (!stbi__psd_decode_rle(s, p, pixelCount)) {
6058 return stbi__errpuc("corrupt", "bad RLE data");
6064 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
6065 // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
6067 // Read the data by channel.
6068 for (channel = 0; channel < 4; channel++) {
6069 if (channel >= channelCount) {
6070 // Fill this channel with default data.
6071 if (bitdepth == 16 && bpc == 16) {
6072 stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
6073 stbi__uint16 val = channel == 3 ? 65535 : 0;
6074 for (i = 0; i < pixelCount; i++, q += 4)
6077 stbi_uc *p = out+channel;
6078 stbi_uc val = channel == 3 ? 255 : 0;
6079 for (i = 0; i < pixelCount; i++, p += 4)
6083 if (ri->bits_per_channel == 16) { // output bpc
6084 stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
6085 for (i = 0; i < pixelCount; i++, q += 4)
6086 *q = (stbi__uint16) stbi__get16be(s);
6088 stbi_uc *p = out+channel;
6089 if (bitdepth == 16) { // input bpc
6090 for (i = 0; i < pixelCount; i++, p += 4)
6091 *p = (stbi_uc) (stbi__get16be(s) >> 8);
6093 for (i = 0; i < pixelCount; i++, p += 4)
6101 // remove weird white matte from PSD
6102 if (channelCount >= 4) {
6103 if (ri->bits_per_channel == 16) {
6104 for (i=0; i < w*h; ++i) {
6105 stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i;
6106 if (pixel[3] != 0 && pixel[3] != 65535) {
6107 float a = pixel[3] / 65535.0f;
6108 float ra = 1.0f / a;
6109 float inv_a = 65535.0f * (1 - ra);
6110 pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a);
6111 pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a);
6112 pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a);
6116 for (i=0; i < w*h; ++i) {
6117 unsigned char *pixel = out + 4*i;
6118 if (pixel[3] != 0 && pixel[3] != 255) {
6119 float a = pixel[3] / 255.0f;
6120 float ra = 1.0f / a;
6121 float inv_a = 255.0f * (1 - ra);
6122 pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
6123 pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
6124 pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
6130 // convert to desired output format
6131 if (req_comp && req_comp != 4) {
6132 if (ri->bits_per_channel == 16)
6133 out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h);
6135 out = stbi__convert_format(out, 4, req_comp, w, h);
6136 if (out == NULL) return out; // stbi__convert_format frees input on failure
6139 if (comp) *comp = 4;
6147 // *************************************************************************************************
6148 // Softimage PIC loader
6151 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
6152 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
6155 static int stbi__pic_is4(stbi__context *s,const char *str)
6159 if (stbi__get8(s) != (stbi_uc)str[i])
6165 static int stbi__pic_test_core(stbi__context *s)
6169 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
6175 if (!stbi__pic_is4(s,"PICT"))
6183 stbi_uc size,type,channel;
6186 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
6190 for (i=0; i<4; ++i, mask>>=1) {
6191 if (channel & mask) {
6192 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
6193 dest[i]=stbi__get8(s);
6200 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
6204 for (i=0;i<4; ++i, mask>>=1)
6209 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
6211 int act_comp=0,num_packets=0,y,chained;
6212 stbi__pic_packet packets[10];
6214 // this will (should...) cater for even some bizarre stuff like having data
6215 // for the same channel in multiple packets.
6217 stbi__pic_packet *packet;
6219 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6220 return stbi__errpuc("bad format","too many packets");
6222 packet = &packets[num_packets++];
6224 chained = stbi__get8(s);
6225 packet->size = stbi__get8(s);
6226 packet->type = stbi__get8(s);
6227 packet->channel = stbi__get8(s);
6229 act_comp |= packet->channel;
6231 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
6232 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
6235 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
6237 for(y=0; y<height; ++y) {
6240 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
6241 stbi__pic_packet *packet = &packets[packet_idx];
6242 stbi_uc *dest = result+y*width*4;
6244 switch (packet->type) {
6246 return stbi__errpuc("bad format","packet has bad compression type");
6248 case 0: {//uncompressed
6251 for(x=0;x<width;++x, dest+=4)
6252 if (!stbi__readval(s,packet->channel,dest))
6262 stbi_uc count,value[4];
6264 count=stbi__get8(s);
6265 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
6268 count = (stbi_uc) left;
6270 if (!stbi__readval(s,packet->channel,value)) return 0;
6272 for(i=0; i<count; ++i,dest+=4)
6273 stbi__copyval(packet->channel,dest,value);
6279 case 2: {//Mixed RLE
6282 int count = stbi__get8(s), i;
6283 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
6285 if (count >= 128) { // Repeated
6289 count = stbi__get16be(s);
6293 return stbi__errpuc("bad file","scanline overrun");
6295 if (!stbi__readval(s,packet->channel,value))
6298 for(i=0;i<count;++i, dest += 4)
6299 stbi__copyval(packet->channel,dest,value);
6302 if (count>left) return stbi__errpuc("bad file","scanline overrun");
6304 for(i=0;i<count;++i, dest+=4)
6305 if (!stbi__readval(s,packet->channel,dest))
6319 static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
6322 int i, x,y, internal_comp;
6325 if (!comp) comp = &internal_comp;
6327 for (i=0; i<92; ++i)
6330 x = stbi__get16be(s);
6331 y = stbi__get16be(s);
6333 if (y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
6334 if (x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
6336 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
6337 if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
6339 stbi__get32be(s); //skip `ratio'
6340 stbi__get16be(s); //skip `fields'
6341 stbi__get16be(s); //skip `pad'
6343 // intermediate buffer is RGBA
6344 result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
6345 memset(result, 0xff, x*y*4);
6347 if (!stbi__pic_load_core(s,x,y,comp, result)) {
6353 if (req_comp == 0) req_comp = *comp;
6354 result=stbi__convert_format(result,4,req_comp,x,y);
6359 static int stbi__pic_test(stbi__context *s)
6361 int r = stbi__pic_test_core(s);
6367 // *************************************************************************************************
6368 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
6381 stbi_uc *out; // output buffer (always 4 components)
6382 stbi_uc *background; // The current "background" as far as a gif is concerned
6384 int flags, bgindex, ratio, transparent, eflags;
6385 stbi_uc pal[256][4];
6386 stbi_uc lpal[256][4];
6387 stbi__gif_lzw codes[8192];
6388 stbi_uc *color_table;
6391 int start_x, start_y;
6398 static int stbi__gif_test_raw(stbi__context *s)
6401 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
6403 if (sz != '9' && sz != '7') return 0;
6404 if (stbi__get8(s) != 'a') return 0;
6408 static int stbi__gif_test(stbi__context *s)
6410 int r = stbi__gif_test_raw(s);
6415 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
6418 for (i=0; i < num_entries; ++i) {
6419 pal[i][2] = stbi__get8(s);
6420 pal[i][1] = stbi__get8(s);
6421 pal[i][0] = stbi__get8(s);
6422 pal[i][3] = transp == i ? 0 : 255;
6426 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
6429 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
6430 return stbi__err("not GIF", "Corrupt GIF");
6432 version = stbi__get8(s);
6433 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
6434 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
6436 stbi__g_failure_reason = "";
6437 g->w = stbi__get16le(s);
6438 g->h = stbi__get16le(s);
6439 g->flags = stbi__get8(s);
6440 g->bgindex = stbi__get8(s);
6441 g->ratio = stbi__get8(s);
6442 g->transparent = -1;
6444 if (g->w > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
6445 if (g->h > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
6447 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
6449 if (is_info) return 1;
6451 if (g->flags & 0x80)
6452 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
6457 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
6459 stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
6460 if (!stbi__gif_header(s, g, comp, 1)) {
6471 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
6476 // recurse to decode the prefixes, since the linked-list is backwards,
6477 // and working backwards through an interleaved image would be nasty
6478 if (g->codes[code].prefix >= 0)
6479 stbi__out_gif_code(g, g->codes[code].prefix);
6481 if (g->cur_y >= g->max_y) return;
6483 idx = g->cur_x + g->cur_y;
6485 g->history[idx / 4] = 1;
6487 c = &g->color_table[g->codes[code].suffix * 4];
6488 if (c[3] > 128) { // don't render transparent pixels;
6496 if (g->cur_x >= g->max_x) {
6497 g->cur_x = g->start_x;
6498 g->cur_y += g->step;
6500 while (g->cur_y >= g->max_y && g->parse > 0) {
6501 g->step = (1 << g->parse) * g->line_size;
6502 g->cur_y = g->start_y + (g->step >> 1);
6508 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
6511 stbi__int32 len, init_code;
6513 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
6516 lzw_cs = stbi__get8(s);
6517 if (lzw_cs > 12) return NULL;
6518 clear = 1 << lzw_cs;
6520 codesize = lzw_cs + 1;
6521 codemask = (1 << codesize) - 1;
6524 for (init_code = 0; init_code < clear; init_code++) {
6525 g->codes[init_code].prefix = -1;
6526 g->codes[init_code].first = (stbi_uc) init_code;
6527 g->codes[init_code].suffix = (stbi_uc) init_code;
6530 // support no starting clear code
6536 if (valid_bits < codesize) {
6538 len = stbi__get8(s); // start new block
6543 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
6546 stbi__int32 code = bits & codemask;
6548 valid_bits -= codesize;
6549 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
6550 if (code == clear) { // clear code
6551 codesize = lzw_cs + 1;
6552 codemask = (1 << codesize) - 1;
6556 } else if (code == clear + 1) { // end of stream code
6558 while ((len = stbi__get8(s)) > 0)
6561 } else if (code <= avail) {
6563 return stbi__errpuc("no clear code", "Corrupt GIF");
6567 p = &g->codes[avail++];
6569 return stbi__errpuc("too many codes", "Corrupt GIF");
6572 p->prefix = (stbi__int16) oldcode;
6573 p->first = g->codes[oldcode].first;
6574 p->suffix = (code == avail) ? p->first : g->codes[code].first;
6575 } else if (code == avail)
6576 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
6578 stbi__out_gif_code(g, (stbi__uint16) code);
6580 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
6582 codemask = (1 << codesize) - 1;
6587 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
6593 // this function is designed to support animated gifs, although stb_image doesn't support it
6594 // two back is the image from two frames ago, used for a very specific disposal format
6595 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back)
6601 STBI_NOTUSED(req_comp);
6603 // on first frame, any non-written pixels get the background colour (non-transparent)
6606 if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
6607 if (!stbi__mad3sizes_valid(4, g->w, g->h, 0))
6608 return stbi__errpuc("too large", "GIF image is too large");
6609 pcount = g->w * g->h;
6610 g->out = (stbi_uc *) stbi__malloc(4 * pcount);
6611 g->background = (stbi_uc *) stbi__malloc(4 * pcount);
6612 g->history = (stbi_uc *) stbi__malloc(pcount);
6613 if (!g->out || !g->background || !g->history)
6614 return stbi__errpuc("outofmem", "Out of memory");
6616 // image is treated as "transparent" at the start - ie, nothing overwrites the current background;
6617 // background colour is only used for pixels that are not rendered first frame, after that "background"
6618 // color refers to the color that was there the previous frame.
6619 memset(g->out, 0x00, 4 * pcount);
6620 memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent)
6621 memset(g->history, 0x00, pcount); // pixels that were affected previous frame
6624 // second frame - how do we dispose of the previous one?
6625 dispose = (g->eflags & 0x1C) >> 2;
6626 pcount = g->w * g->h;
6628 if ((dispose == 3) && (two_back == 0)) {
6629 dispose = 2; // if I don't have an image to revert back to, default to the old background
6632 if (dispose == 3) { // use previous graphic
6633 for (pi = 0; pi < pcount; ++pi) {
6634 if (g->history[pi]) {
6635 memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 );
6638 } else if (dispose == 2) {
6639 // restore what was changed last frame to background before that frame;
6640 for (pi = 0; pi < pcount; ++pi) {
6641 if (g->history[pi]) {
6642 memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 );
6646 // This is a non-disposal case eithe way, so just
6647 // leave the pixels as is, and they will become the new background
6648 // 1: do not dispose
6649 // 0: not specified.
6652 // background is what out is after the undoing of the previou frame;
6653 memcpy( g->background, g->out, 4 * g->w * g->h );
6656 // clear my history;
6657 memset( g->history, 0x00, g->w * g->h ); // pixels that were affected previous frame
6660 int tag = stbi__get8(s);
6662 case 0x2C: /* Image Descriptor */
6664 stbi__int32 x, y, w, h;
6667 x = stbi__get16le(s);
6668 y = stbi__get16le(s);
6669 w = stbi__get16le(s);
6670 h = stbi__get16le(s);
6671 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
6672 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
6674 g->line_size = g->w * 4;
6676 g->start_y = y * g->line_size;
6677 g->max_x = g->start_x + w * 4;
6678 g->max_y = g->start_y + h * g->line_size;
6679 g->cur_x = g->start_x;
6680 g->cur_y = g->start_y;
6682 // if the width of the specified rectangle is 0, that means
6683 // we may not see *any* pixels or the image is malformed;
6684 // to make sure this is caught, move the current y down to
6685 // max_y (which is what out_gif_code checks).
6687 g->cur_y = g->max_y;
6689 g->lflags = stbi__get8(s);
6691 if (g->lflags & 0x40) {
6692 g->step = 8 * g->line_size; // first interlaced spacing
6695 g->step = g->line_size;
6699 if (g->lflags & 0x80) {
6700 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
6701 g->color_table = (stbi_uc *) g->lpal;
6702 } else if (g->flags & 0x80) {
6703 g->color_table = (stbi_uc *) g->pal;
6705 return stbi__errpuc("missing color table", "Corrupt GIF");
6707 o = stbi__process_gif_raster(s, g);
6708 if (!o) return NULL;
6710 // if this was the first frame,
6711 pcount = g->w * g->h;
6712 if (first_frame && (g->bgindex > 0)) {
6713 // if first frame, any pixel not drawn to gets the background color
6714 for (pi = 0; pi < pcount; ++pi) {
6715 if (g->history[pi] == 0) {
6716 g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
6717 memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 );
6725 case 0x21: // Comment Extension.
6728 int ext = stbi__get8(s);
6729 if (ext == 0xF9) { // Graphic Control Extension.
6730 len = stbi__get8(s);
6732 g->eflags = stbi__get8(s);
6733 g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
6735 // unset old transparent
6736 if (g->transparent >= 0) {
6737 g->pal[g->transparent][3] = 255;
6739 if (g->eflags & 0x01) {
6740 g->transparent = stbi__get8(s);
6741 if (g->transparent >= 0) {
6742 g->pal[g->transparent][3] = 0;
6745 // don't need transparent
6747 g->transparent = -1;
6754 while ((len = stbi__get8(s)) != 0) {
6760 case 0x3B: // gif stream termination code
6761 return (stbi_uc *) s; // using '1' causes warning on some compilers
6764 return stbi__errpuc("unknown code", "Corrupt GIF");
6769 static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
6771 if (stbi__gif_test(s)) {
6775 stbi_uc *two_back = 0;
6779 int delays_size = 0;
6780 memset(&g, 0, sizeof(g));
6786 u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
6787 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
6793 stride = g.w * g.h * 4;
6796 void *tmp = (stbi_uc*) STBI_REALLOC_SIZED( out, out_size, layers * stride );
6799 STBI_FREE(g.history);
6800 STBI_FREE(g.background);
6801 return stbi__errpuc("outofmem", "Out of memory");
6804 out = (stbi_uc*) tmp;
6805 out_size = layers * stride;
6809 *delays = (int*) STBI_REALLOC_SIZED( *delays, delays_size, sizeof(int) * layers );
6810 delays_size = layers * sizeof(int);
6813 out = (stbi_uc*)stbi__malloc( layers * stride );
6814 out_size = layers * stride;
6816 *delays = (int*) stbi__malloc( layers * sizeof(int) );
6817 delays_size = layers * sizeof(int);
6820 memcpy( out + ((layers - 1) * stride), u, stride );
6822 two_back = out - 2 * stride;
6826 (*delays)[layers - 1U] = g.delay;
6831 // free temp buffer;
6833 STBI_FREE(g.history);
6834 STBI_FREE(g.background);
6836 // do the final conversion after loading everything;
6837 if (req_comp && req_comp != 4)
6838 out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
6843 return stbi__errpuc("not GIF", "Image was not as a gif type.");
6847 static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
6851 memset(&g, 0, sizeof(g));
6854 u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
6855 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
6860 // moved conversion to after successful load so that the same
6861 // can be done for multiple frames.
6862 if (req_comp && req_comp != 4)
6863 u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
6865 // if there was an error and we allocated an image buffer, free it!
6869 // free buffers needed for multiple frame loading;
6870 STBI_FREE(g.history);
6871 STBI_FREE(g.background);
6876 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
6878 return stbi__gif_info_raw(s,x,y,comp);
6882 // *************************************************************************************************
6883 // Radiance RGBE HDR loader
6884 // originally by Nicolas Schulz
6886 static int stbi__hdr_test_core(stbi__context *s, const char *signature)
6889 for (i=0; signature[i]; ++i)
6890 if (stbi__get8(s) != signature[i])
6896 static int stbi__hdr_test(stbi__context* s)
6898 int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
6901 r = stbi__hdr_test_core(s, "#?RGBE\n");
6907 #define STBI__HDR_BUFLEN 1024
6908 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
6913 c = (char) stbi__get8(z);
6915 while (!stbi__at_eof(z) && c != '\n') {
6917 if (len == STBI__HDR_BUFLEN-1) {
6918 // flush to end of line
6919 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
6923 c = (char) stbi__get8(z);
6930 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
6932 if ( input[3] != 0 ) {
6935 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
6937 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
6939 output[0] = input[0] * f1;
6940 output[1] = input[1] * f1;
6941 output[2] = input[2] * f1;
6943 if (req_comp == 2) output[1] = 1;
6944 if (req_comp == 4) output[3] = 1;
6947 case 4: output[3] = 1; /* fallthrough */
6948 case 3: output[0] = output[1] = output[2] = 0;
6950 case 2: output[1] = 1; /* fallthrough */
6951 case 1: output[0] = 0;
6957 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
6959 char buffer[STBI__HDR_BUFLEN];
6966 unsigned char count, value;
6967 int i, j, k, c1,c2, z;
6968 const char *headerToken;
6972 headerToken = stbi__hdr_gettoken(s,buffer);
6973 if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
6974 return stbi__errpf("not HDR", "Corrupt HDR image");
6978 token = stbi__hdr_gettoken(s,buffer);
6979 if (token[0] == 0) break;
6980 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6983 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6985 // Parse width and height
6986 // can't use sscanf() if we're not using stdio!
6987 token = stbi__hdr_gettoken(s,buffer);
6988 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6990 height = (int) strtol(token, &token, 10);
6991 while (*token == ' ') ++token;
6992 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6994 width = (int) strtol(token, NULL, 10);
6996 if (height > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
6997 if (width > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
7002 if (comp) *comp = 3;
7003 if (req_comp == 0) req_comp = 3;
7005 if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
7006 return stbi__errpf("too large", "HDR image is too large");
7009 hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
7011 return stbi__errpf("outofmem", "Out of memory");
7014 // image data is stored as some number of sca
7015 if ( width < 8 || width >= 32768) {
7017 for (j=0; j < height; ++j) {
7018 for (i=0; i < width; ++i) {
7021 stbi__getn(s, rgbe, 4);
7022 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
7026 // Read RLE-encoded data
7029 for (j = 0; j < height; ++j) {
7032 len = stbi__get8(s);
7033 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
7034 // not run-length encoded, so we have to actually use THIS data as a decoded
7035 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
7037 rgbe[0] = (stbi_uc) c1;
7038 rgbe[1] = (stbi_uc) c2;
7039 rgbe[2] = (stbi_uc) len;
7040 rgbe[3] = (stbi_uc) stbi__get8(s);
7041 stbi__hdr_convert(hdr_data, rgbe, req_comp);
7044 STBI_FREE(scanline);
7045 goto main_decode_loop; // yes, this makes no sense
7048 len |= stbi__get8(s);
7049 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
7050 if (scanline == NULL) {
7051 scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0);
7053 STBI_FREE(hdr_data);
7054 return stbi__errpf("outofmem", "Out of memory");
7058 for (k = 0; k < 4; ++k) {
7061 while ((nleft = width - i) > 0) {
7062 count = stbi__get8(s);
7065 value = stbi__get8(s);
7067 if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
7068 for (z = 0; z < count; ++z)
7069 scanline[i++ * 4 + k] = value;
7072 if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
7073 for (z = 0; z < count; ++z)
7074 scanline[i++ * 4 + k] = stbi__get8(s);
7078 for (i=0; i < width; ++i)
7079 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
7082 STBI_FREE(scanline);
7088 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
7090 char buffer[STBI__HDR_BUFLEN];
7097 if (!comp) comp = &dummy;
7099 if (stbi__hdr_test(s) == 0) {
7105 token = stbi__hdr_gettoken(s,buffer);
7106 if (token[0] == 0) break;
7107 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
7114 token = stbi__hdr_gettoken(s,buffer);
7115 if (strncmp(token, "-Y ", 3)) {
7120 *y = (int) strtol(token, &token, 10);
7121 while (*token == ' ') ++token;
7122 if (strncmp(token, "+X ", 3)) {
7127 *x = (int) strtol(token, NULL, 10);
7131 #endif // STBI_NO_HDR
7134 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
7137 stbi__bmp_data info;
7140 p = stbi__bmp_parse_header(s, &info);
7144 if (x) *x = s->img_x;
7145 if (y) *y = s->img_y;
7147 if (info.bpp == 24 && info.ma == 0xff000000)
7150 *comp = info.ma ? 4 : 3;
7157 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
7159 int channelCount, dummy, depth;
7162 if (!comp) comp = &dummy;
7163 if (stbi__get32be(s) != 0x38425053) {
7167 if (stbi__get16be(s) != 1) {
7172 channelCount = stbi__get16be(s);
7173 if (channelCount < 0 || channelCount > 16) {
7177 *y = stbi__get32be(s);
7178 *x = stbi__get32be(s);
7179 depth = stbi__get16be(s);
7180 if (depth != 8 && depth != 16) {
7184 if (stbi__get16be(s) != 3) {
7192 static int stbi__psd_is16(stbi__context *s)
7194 int channelCount, depth;
7195 if (stbi__get32be(s) != 0x38425053) {
7199 if (stbi__get16be(s) != 1) {
7204 channelCount = stbi__get16be(s);
7205 if (channelCount < 0 || channelCount > 16) {
7209 (void) stbi__get32be(s);
7210 (void) stbi__get32be(s);
7211 depth = stbi__get16be(s);
7221 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
7223 int act_comp=0,num_packets=0,chained,dummy;
7224 stbi__pic_packet packets[10];
7228 if (!comp) comp = &dummy;
7230 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
7237 *x = stbi__get16be(s);
7238 *y = stbi__get16be(s);
7239 if (stbi__at_eof(s)) {
7243 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
7251 stbi__pic_packet *packet;
7253 if (num_packets==sizeof(packets)/sizeof(packets[0]))
7256 packet = &packets[num_packets++];
7257 chained = stbi__get8(s);
7258 packet->size = stbi__get8(s);
7259 packet->type = stbi__get8(s);
7260 packet->channel = stbi__get8(s);
7261 act_comp |= packet->channel;
7263 if (stbi__at_eof(s)) {
7267 if (packet->size != 8) {
7273 *comp = (act_comp & 0x10 ? 4 : 3);
7279 // *************************************************************************************************
7280 // Portable Gray Map and Portable Pixel Map loader
7283 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
7284 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
7286 // Known limitations:
7287 // Does not support comments in the header section
7288 // Does not support ASCII image data (formats P2 and P3)
7289 // Does not support 16-bit-per-channel
7293 static int stbi__pnm_test(stbi__context *s)
7296 p = (char) stbi__get8(s);
7297 t = (char) stbi__get8(s);
7298 if (p != 'P' || (t != '5' && t != '6')) {
7305 static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
7310 if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
7313 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
7314 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
7318 if (comp) *comp = s->img_n;
7320 if (!stbi__mad3sizes_valid(s->img_n, s->img_x, s->img_y, 0))
7321 return stbi__errpuc("too large", "PNM too large");
7323 out = (stbi_uc *) stbi__malloc_mad3(s->img_n, s->img_x, s->img_y, 0);
7324 if (!out) return stbi__errpuc("outofmem", "Out of memory");
7325 stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
7327 if (req_comp && req_comp != s->img_n) {
7328 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
7329 if (out == NULL) return out; // stbi__convert_format frees input on failure
7334 static int stbi__pnm_isspace(char c)
7336 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
7339 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
7342 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
7343 *c = (char) stbi__get8(s);
7345 if (stbi__at_eof(s) || *c != '#')
7348 while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
7349 *c = (char) stbi__get8(s);
7353 static int stbi__pnm_isdigit(char c)
7355 return c >= '0' && c <= '9';
7358 static int stbi__pnm_getinteger(stbi__context *s, char *c)
7362 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
7363 value = value*10 + (*c - '0');
7364 *c = (char) stbi__get8(s);
7370 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
7377 if (!comp) comp = &dummy;
7382 p = (char) stbi__get8(s);
7383 t = (char) stbi__get8(s);
7384 if (p != 'P' || (t != '5' && t != '6')) {
7389 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
7391 c = (char) stbi__get8(s);
7392 stbi__pnm_skip_whitespace(s, &c);
7394 *x = stbi__pnm_getinteger(s, &c); // read width
7395 stbi__pnm_skip_whitespace(s, &c);
7397 *y = stbi__pnm_getinteger(s, &c); // read height
7398 stbi__pnm_skip_whitespace(s, &c);
7400 maxv = stbi__pnm_getinteger(s, &c); // read max value
7403 return stbi__err("max value > 255", "PPM image not 8-bit");
7409 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
7411 #ifndef STBI_NO_JPEG
7412 if (stbi__jpeg_info(s, x, y, comp)) return 1;
7416 if (stbi__png_info(s, x, y, comp)) return 1;
7420 if (stbi__gif_info(s, x, y, comp)) return 1;
7424 if (stbi__bmp_info(s, x, y, comp)) return 1;
7428 if (stbi__psd_info(s, x, y, comp)) return 1;
7432 if (stbi__pic_info(s, x, y, comp)) return 1;
7436 if (stbi__pnm_info(s, x, y, comp)) return 1;
7440 if (stbi__hdr_info(s, x, y, comp)) return 1;
7443 // test tga last because it's a crappy test!
7445 if (stbi__tga_info(s, x, y, comp))
7448 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
7451 static int stbi__is_16_main(stbi__context *s)
7454 if (stbi__png_is16(s)) return 1;
7458 if (stbi__psd_is16(s)) return 1;
7464 #ifndef STBI_NO_STDIO
7465 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
7467 FILE *f = stbi__fopen(filename, "rb");
7469 if (!f) return stbi__err("can't fopen", "Unable to open file");
7470 result = stbi_info_from_file(f, x, y, comp);
7475 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
7479 long pos = ftell(f);
7480 stbi__start_file(&s, f);
7481 r = stbi__info_main(&s,x,y,comp);
7482 fseek(f,pos,SEEK_SET);
7486 STBIDEF int stbi_is_16_bit(char const *filename)
7488 FILE *f = stbi__fopen(filename, "rb");
7490 if (!f) return stbi__err("can't fopen", "Unable to open file");
7491 result = stbi_is_16_bit_from_file(f);
7496 STBIDEF int stbi_is_16_bit_from_file(FILE *f)
7500 long pos = ftell(f);
7501 stbi__start_file(&s, f);
7502 r = stbi__is_16_main(&s);
7503 fseek(f,pos,SEEK_SET);
7506 #endif // !STBI_NO_STDIO
7508 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
7511 stbi__start_mem(&s,buffer,len);
7512 return stbi__info_main(&s,x,y,comp);
7515 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
7518 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
7519 return stbi__info_main(&s,x,y,comp);
7522 STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len)
7525 stbi__start_mem(&s,buffer,len);
7526 return stbi__is_16_main(&s);
7529 STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user)
7532 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
7533 return stbi__is_16_main(&s);
7536 #endif // STB_IMAGE_IMPLEMENTATION
7540 2.20 (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
7541 2.19 (2018-02-11) fix warning
7542 2.18 (2018-01-30) fix warnings
7543 2.17 (2018-01-29) change sbti__shiftsigned to avoid clang -O2 bug
7547 2.16 (2017-07-23) all functions have 16-bit variants;
7548 STBI_NO_STDIO works again;
7550 fix rounding in unpremultiply;
7551 optimize vertical flip;
7552 disable raw_len validation;
7554 2.15 (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
7555 warning fixes; disable run-time SSE detection on gcc;
7556 uniform handling of optional "return" values;
7557 thread-safe initialization of zlib tables
7558 2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
7559 2.13 (2016-11-29) add 16-bit API, only supported for PNG right now
7560 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
7561 2.11 (2016-04-02) allocate large structures on the stack
7562 remove white matting for transparent PSD
7563 fix reported channel count for PNG & BMP
7564 re-enable SSE2 in non-gcc 64-bit
7565 support RGB-formatted JPEG
7566 read 16-bit PNGs (only as 8-bit)
7567 2.10 (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
7568 2.09 (2016-01-16) allow comments in PNM files
7569 16-bit-per-pixel TGA (not bit-per-component)
7570 info() for TGA could break due to .hdr handling
7571 info() for BMP to shares code instead of sloppy parse
7572 can use STBI_REALLOC_SIZED if allocator doesn't support realloc
7574 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
7575 2.07 (2015-09-13) fix compiler warnings
7576 partial animated GIF support
7577 limited 16-bpc PSD support
7578 #ifdef unused functions
7579 bug with < 92 byte PIC,PNM,HDR,TGA
7580 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
7581 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
7582 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
7583 2.03 (2015-04-12) extra corruption checking (mmozeiko)
7584 stbi_set_flip_vertically_on_load (nguillemot)
7585 fix NEON support; fix mingw support
7586 2.02 (2015-01-19) fix incorrect assert, fix warning
7587 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
7588 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
7589 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
7590 progressive JPEG (stb)
7591 PGM/PPM support (Ken Miller)
7592 STBI_MALLOC,STBI_REALLOC,STBI_FREE
7593 GIF bugfix -- seemingly never worked
7594 STBI_NO_*, STBI_ONLY_*
7595 1.48 (2014-12-14) fix incorrectly-named assert()
7596 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
7598 fix bug in interlaced PNG with user-specified channel count (stb)
7600 fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
7602 fix MSVC-ARM internal compiler error by wrapping malloc
7604 various warning fixes from Ronny Chevalier
7606 fix MSVC-only compiler problem in code changed in 1.42
7608 don't define _CRT_SECURE_NO_WARNINGS (affects user code)
7609 fixes to stbi__cleanup_jpeg path
7610 added STBI_ASSERT to avoid requiring assert.h
7612 fix search&replace from 1.36 that messed up comments/error messages
7614 fix gcc struct-initialization warning
7616 fix to TGA optimization when req_comp != number of components in TGA;
7617 fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
7618 add support for BMP version 5 (more ignored fields)
7620 suppress MSVC warnings on integer casts truncating values
7621 fix accidental rename of 'skip' field of I/O
7623 remove duplicate typedef
7625 convert to header file single-file library
7626 if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
7629 fix broken STBI_SIMD path
7630 fix bug where stbi_load_from_file no longer left file pointer in correct place
7631 fix broken non-easy path for 32-bit BMP (possibly never used)
7632 TGA optimization by Arseny Kapoulkine
7634 use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
7636 make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
7638 support for "info" function for all supported filetypes (SpartanJ)
7640 a few more leak fixes, bug in PNG handling (SpartanJ)
7642 added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
7643 removed deprecated format-specific test/load functions
7644 removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
7645 error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
7646 fix inefficiency in decoding 32-bit BMP (David Woo)
7648 various warning fixes from Aurelien Pocheville
7650 fix bug in GIF palette transparency (SpartanJ)
7652 cast-to-stbi_uc to fix warnings
7654 fix bug in file buffering for PNG reported by SpartanJ
7656 refix trans_data warning (Won Chun)
7658 perf improvements reading from files on platforms with lock-heavy fgetc()
7659 minor perf improvements for jpeg
7660 deprecated type-specific functions so we'll get feedback if they're needed
7661 attempt to fix trans_data warning (Won Chun)
7662 1.23 fixed bug in iPhone support
7664 removed image *writing* support
7665 stbi_info support from Jetro Lauha
7666 GIF support from Jean-Marc Lienher
7667 iPhone PNG-extensions from James Brown
7668 warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
7669 1.21 fix use of 'stbi_uc' in header (reported by jon blow)
7670 1.20 added support for Softimage PIC, by Tom Seddon
7671 1.19 bug in interlaced PNG corruption check (found by ryg)
7673 fix a threading bug (local mutable static)
7674 1.17 support interlaced PNG
7675 1.16 major bugfix - stbi__convert_format converted one too many pixels
7676 1.15 initialize some fields for thread safety
7677 1.14 fix threadsafe conversion bug
7678 header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
7680 1.12 const qualifiers in the API
7681 1.11 Support installable IDCT, colorspace conversion routines
7682 1.10 Fixes for 64-bit (don't use "unsigned long")
7683 optimized upsampling by Fabian "ryg" Giesen
7684 1.09 Fix format-conversion for PSD code (bad global variables!)
7685 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
7686 1.07 attempt to fix C++ warning/errors again
7687 1.06 attempt to fix C++ warning/errors again
7688 1.05 fix TGA loading to return correct *comp and use good luminance calc
7689 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
7690 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
7691 1.02 support for (subset of) HDR files, float interface for preferred access to them
7692 1.01 fix bug: possible bug in handling right-side up bmps... not sure
7693 fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
7694 1.00 interface to zlib that skips zlib header
7695 0.99 correct handling of alpha in palette
7696 0.98 TGA loader by lonesock; dynamically add loaders (untested)
7697 0.97 jpeg errors on too large a file; also catch another malloc failure
7698 0.96 fix detection of invalid v value - particleman@mollyrocket forum
7699 0.95 during header scan, seek to markers in case of padding
7700 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
7701 0.93 handle jpegtran output; verbose errors
7702 0.92 read 4,8,16,24,32-bit BMP files of several formats
7703 0.91 output 24-bit Windows 3.0 BMP files
7704 0.90 fix a few more warnings; bump version number to approach 1.0
7705 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
7706 0.60 fix compiling as c++
7707 0.59 fix warnings: merge Dave Moore's -Wall fixes
7708 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
7709 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
7710 0.56 fix bug: zlib uncompressed mode len vs. nlen
7711 0.55 fix bug: restart_interval not initialized to 0
7712 0.54 allow NULL for 'int *comp'
7713 0.53 fix bug in png 3->4; speedup png decoding
7714 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
7715 0.51 obey req_comp requests, 1-component jpegs return as 1-component,
7716 on 'test' only check type, not whether we support this variant
7718 first released version
7723 ------------------------------------------------------------------------------
7724 This software is available under 2 licenses -- choose whichever you prefer.
7725 ------------------------------------------------------------------------------
7726 ALTERNATIVE A - MIT License
7727 Copyright (c) 2017 Sean Barrett
7728 Permission is hereby granted, free of charge, to any person obtaining a copy of
7729 this software and associated documentation files (the "Software"), to deal in
7730 the Software without restriction, including without limitation the rights to
7731 use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
7732 of the Software, and to permit persons to whom the Software is furnished to do
7733 so, subject to the following conditions:
7734 The above copyright notice and this permission notice shall be included in all
7735 copies or substantial portions of the Software.
7736 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
7737 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
7738 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
7739 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
7740 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
7741 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
7743 ------------------------------------------------------------------------------
7744 ALTERNATIVE B - Public Domain (www.unlicense.org)
7745 This is free and unencumbered software released into the public domain.
7746 Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
7747 software, either in source code form or as a compiled binary, for any purpose,
7748 commercial or non-commercial, and by any means.
7749 In jurisdictions that recognize copyright laws, the author or authors of this
7750 software dedicate any and all copyright interest in the software to the public
7751 domain. We make this dedication for the benefit of the public at large and to
7752 the detriment of our heirs and successors. We intend this dedication to be an
7753 overt act of relinquishment in perpetuity of all present and future rights to
7754 this software under copyright law.
7755 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
7756 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
7757 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
7758 AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
7759 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
7760 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
7761 ------------------------------------------------------------------------------