Background
==========

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
AVX2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression
on x86, x86-64, ARM, and PowerPC systems, as well as progressive JPEG
compression on x86 and x86-64 systems.  On such systems, libjpeg-turbo is
generally 2-6x as fast as libjpeg, all else being equal.  On other types of
systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
virtue of its highly optimized Huffman coding routines.  In many cases, the
performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less
powerful but more straightforward TurboJPEG API.  libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru.  The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.


License
=======

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to [LICENSE.md](LICENSE.md) for a roll-up of license terms.


Building libjpeg-turbo
======================

Refer to [BUILDING.md](BUILDING.md) for complete instructions.


Using libjpeg-turbo
===================

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

- **TurboJPEG API**<br>
  This API provides an easy-to-use interface for compressing and decompressing
  JPEG images in memory.  It also provides some functionality that would not be
  straightforward to achieve using the underlying libjpeg API, such as
  generating planar YUV images and performing multiple simultaneous lossless
  transforms on an image.  The Java interface for libjpeg-turbo is written on
  top of the TurboJPEG API.  The TurboJPEG API is recommended for first-time
  users of libjpeg-turbo.  Refer to [tjexample.c](tjexample.c) and
  [TJExample.java](java/TJExample.java) for examples of its usage and to
  <http://libjpeg-turbo.org/Documentation/Documentation> for API documentation.

- **libjpeg API**<br>
  This is the de facto industry-standard API for compressing and decompressing
  JPEG images.  It is more difficult to use than the TurboJPEG API but also
  more powerful.  The libjpeg API implementation in libjpeg-turbo is both
  API/ABI-compatible and mathematically compatible with libjpeg v6b.  It can
  also optionally be configured to be API/ABI-compatible with libjpeg v7 and v8
  (see below).  Refer to [cjpeg.c](cjpeg.c) and [djpeg.c](djpeg.c) for examples
  of its usage and to [libjpeg.txt](libjpeg.txt) for API documentation.

There is no significant performance advantage to either API when both are used
to perform similar operations.

Colorspace Extensions
---------------------

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering.  This is implemented with ten new
colorspace constants:

    JCS_EXT_RGB   /* red/green/blue */
    JCS_EXT_RGBX  /* red/green/blue/x */
    JCS_EXT_BGR   /* blue/green/red */
    JCS_EXT_BGRX  /* blue/green/red/x */
    JCS_EXT_XBGR  /* x/blue/green/red */
    JCS_EXT_XRGB  /* x/red/green/blue */
    JCS_EXT_RGBA  /* red/green/blue/alpha */
    JCS_EXT_BGRA  /* blue/green/red/alpha */
    JCS_EXT_ABGR  /* alpha/blue/green/red */
    JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting `cinfo.in_color_space` (compression) or `cinfo.out_color_space`
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.
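
To illustrate what "the appropriate position in the pixel" means, the sketch
below tabulates the byte offsets implied by each ordering and uses them to read
one pixel.  The `PixelOrder` enum, the `PixelLayout` struct, and `read_rgb()`
are hypothetical names invented for this example; they are not part of the
libjpeg API.  The alpha orderings (RGBA, BGRA, ABGR, ARGB) place R, G, and B at
the same offsets as their X counterparts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the pixel orderings (NOT the real J_COLOR_SPACE) */
typedef enum {
  ORDER_RGB, ORDER_RGBX, ORDER_BGR, ORDER_BGRX, ORDER_XBGR, ORDER_XRGB,
  ORDER_COUNT
} PixelOrder;

typedef struct {
  int pixel_size;  /* bytes per pixel */
  int r, g, b;     /* byte offset of each component within a pixel */
} PixelLayout;

static const PixelLayout layouts[ORDER_COUNT] = {
  { 3, 0, 1, 2 },  /* RGB  */
  { 4, 0, 1, 2 },  /* RGBX */
  { 3, 2, 1, 0 },  /* BGR  */
  { 4, 2, 1, 0 },  /* BGRX */
  { 4, 3, 2, 1 },  /* XBGR */
  { 4, 1, 2, 3 }   /* XRGB */
};

/* Copy the red, green, and blue samples of pixel `index` in `row` to rgb[] */
void read_rgb(const unsigned char *row, size_t index, PixelOrder order,
              unsigned char rgb[3])
{
  const PixelLayout *layout;

  assert(row != NULL && rgb != NULL && order < ORDER_COUNT);
  layout = &layouts[order];
  rgb[0] = row[index * (size_t)layout->pixel_size + (size_t)layout->r];
  rgb[1] = row[index * (size_t)layout->pixel_size + (size_t)layout->g];
  rgb[2] = row[index * (size_t)layout->pixel_size + (size_t)layout->b];
}
```

For example, with `ORDER_XBGR` the red sample of each pixel is its fourth byte,
whereas with `ORDER_BGRX` it is the third.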

Your application can check for the existence of these extensions at compile
time with:

    #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes.  If an application expects the X
byte to be used as an alpha channel, then it should specify `JCS_EXT_RGBA`,
`JCS_EXT_BGRA`, `JCS_EXT_ABGR`, or `JCS_EXT_ARGB`.  When these colorspace
constants are used, the X byte is guaranteed to be 0xFF, which is interpreted
as opaque.
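
If an application has to decompress with one of the X orderings but still needs
an opaque alpha channel, one possible workaround is to overwrite the X byte
itself after each row has been decompressed.  The following is a minimal sketch
of that idea, assuming 4-byte pixels; `fill_opaque()` is a hypothetical helper
written for this example, not a libjpeg-turbo function.

```c
#include <assert.h>
#include <stddef.h>

/* Set byte `x_offset` (0-3) of every 4-byte pixel in `row` to 0xFF (opaque).
   Use x_offset 3 for RGBX/BGRX rows and x_offset 0 for XBGR/XRGB rows. */
void fill_opaque(unsigned char *row, size_t width, int x_offset)
{
  size_t i;

  assert(row != NULL && x_offset >= 0 && x_offset < 4);
  for (i = 0; i < width; i++)
    row[i * 4 + (size_t)x_offset] = 0xFF;
}
```

Requesting one of the alpha colorspace constants is preferable when it is
available, since it avoids a second pass over the row.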

Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

    #ifdef JCS_ALPHA_EXTENSIONS

[jcstest.c](jcstest.c), located in the libjpeg-turbo source tree, demonstrates
how to check for the existence of the colorspace extensions at compile time and
run time.

libjpeg v7 and v8 API/ABI Emulation
-----------------------------------

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures.  Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases.  Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base.  Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distros) made
the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo.  It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling.  libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of `--with-jpeg7` or `--with-jpeg8` to `configure`, or
an argument of `-DWITH_JPEG7=1` or `-DWITH_JPEG8=1` to `cmake`, you can build a
version of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so that
programs that are built against libjpeg v7 or v8 can be run with libjpeg-turbo.
The following section describes which libjpeg v7+ features are supported and
which aren't.

### Support for libjpeg v7 and v8 Features

#### Fully supported

- **libjpeg: IDCT scaling extensions in decompressor**<br>
  libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
  1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
  and 1/2 are SIMD-accelerated).

- **libjpeg: Arithmetic coding**

- **libjpeg: In-memory source and destination managers**<br>
  See notes below.

- **cjpeg: Separate quality settings for luminance and chrominance**<br>
  Note that the libjpeg v7+ API was extended to accommodate this feature only
  for convenience purposes.  It has always been possible to implement this
  feature with libjpeg v6b (see rdswitch.c for an example).

- **cjpeg: 32-bit BMP support**

- **cjpeg: `-rgb` option**

- **jpegtran: Lossless cropping**

- **jpegtran: `-perfect` option**

- **jpegtran: Forcing width/height when performing lossless crop**

- **rdjpgcom: `-raw` option**

- **rdjpgcom: Locale awareness**
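
The IDCT scaling factors listed above are all of the form M/8, with M ranging
from 1 to 16.  The sketch below shows how an application might validate a
requested factor and estimate the resulting output dimensions.  It assumes that
scaled dimensions are rounded up, which is the rule used by the `TJSCALED()`
macro in the TurboJPEG API; `scaled_dimension()` and `is_supported_scale()` are
hypothetical helper names that are not part of either API.

```c
#include <assert.h>

/* Ceiling of dimension * num / denom (assumed round-up rule) */
unsigned long scaled_dimension(unsigned long dimension, unsigned int num,
                               unsigned int denom)
{
  assert(denom != 0);
  return (dimension * num + denom - 1) / denom;
}

/* Nonzero if num/denom is equal to M/8 for some integer M in 1..16 */
int is_supported_scale(unsigned int num, unsigned int denom)
{
  unsigned int eighths;

  assert(denom != 0);
  if ((num * 8) % denom != 0)
    return 0;                  /* not a whole number of eighths */
  eighths = (num * 8) / denom;
  return eighths >= 1 && eighths <= 16;
}
```

With these helpers, a 1920-pixel-wide image decompressed at 3/8 scale would be
720 pixels wide, and a factor such as 1/3 would be rejected.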


#### Not supported

NOTE:  As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement.  The reader is invited to peruse the research at
<http://www.libjpeg-turbo.org/About/SmartScale> and draw their own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

- **libjpeg: DCT scaling in compressor**<br>
  `cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
  There is no technical reason why DCT scaling could not be supported when
  emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
  below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
  8/9 would be available, which is of limited usefulness.

- **libjpeg: SmartScale**<br>
  `cinfo.block_size` is silently ignored.
  SmartScale is an extension to the JPEG format that allows for DCT block
  sizes other than 8x8.  Providing support for this new format would be
  feasible (particularly without full acceleration).  However, until/unless
  the format becomes either an official industry standard or, at minimum, an
  accepted solution in the community, we are hesitant to implement it, as
  there is no sense of whether or how it might change in the future.  It is
  our belief that SmartScale has not demonstrated sufficient usefulness either
  as a lossless format or as a means of quality enhancement, and thus our
  primary interest in providing this feature would be as a means of supporting
  additional DCT scaling factors.

- **libjpeg: Fancy downsampling in compressor**<br>
  `cinfo.do_fancy_downsampling` is silently ignored.
  This requires the DCT scaling feature, which is not supported.

- **jpegtran: Scaling**<br>
  This requires both the DCT scaling and SmartScale features, which are not
  supported.

- **Lossless RGB JPEG files**<br>
  This requires the SmartScale feature, which is not supported.

### What About libjpeg v9?

libjpeg v9 introduced yet another field to the JPEG compression structure
(`color_transform`), thus making the ABI backward incompatible with that of
libjpeg v8.  This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding.  Furthermore, there was actually no reason to
extend the API in this manner, as the color transform could have just as easily
been activated by way of a new JPEG colorspace constant, thus preserving
backward ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats.  Therefore, at this time it is our belief
that there is not sufficient technical justification for software projects to
upgrade from libjpeg v8 to libjpeg v9, and thus there is not sufficient
technical justification for us to emulate the libjpeg v9 ABI.

In-Memory Source/Destination Managers
-------------------------------------

By default, libjpeg-turbo 1.3 and later includes the `jpeg_mem_src()` and
`jpeg_mem_dest()` functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well.  This allows the use of those functions by
programs that need them, without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of `--without-mem-srcdst` to `configure` or
an argument of `-DWITH_MEM_SRCDST=0` to `cmake` prior to building
libjpeg-turbo.  This will restore the pre-1.3 behavior, in which
`jpeg_mem_src()` and `jpeg_mem_dest()` are only included when emulating the
libjpeg v8 API/ABI.

On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.1.0 to 62.2.0 if using libjpeg v6b API/ABI
emulation and from 7.1.0 to 7.2.0 if using libjpeg v7 API/ABI emulation.

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used.  Thus, if a program
is built against libjpeg-turbo 1.3+ and uses `jpeg_mem_src()` or
`jpeg_mem_dest()`, that program will not fail if run against an older version
of libjpeg-turbo or against libjpeg v7 or earlier until the program actually
tries to call `jpeg_mem_src()` or `jpeg_mem_dest()`.  Such is not the case on
Windows.  If a program is built against the libjpeg-turbo 1.3+ DLL and uses
`jpeg_mem_src()` or `jpeg_mem_dest()`, then it must use the libjpeg-turbo 1.3+
DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions.  See their respective man pages for more
details.


Mathematical Compatibility
==========================

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b.  The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
  slightly more accurate than the implementation in libjpeg v6b, but not by
  any amount perceptible to human vision (generally in the range of 0.01 to
  0.08 dB gain in PSNR).

- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
  (and slightly faster) floating point IDCT algorithm introduced in libjpeg
  v8a as opposed to the algorithm used in libjpeg v6b.  It should be noted,
  however, that this algorithm basically brings the accuracy of the floating
  point IDCT in line with the accuracy of the slow integer IDCT.  The floating
  point DCT/IDCT algorithms are mainly a legacy feature, and they do not
  produce significantly more accuracy than the slow integer algorithms (to put
  numbers on this, the typical difference in PSNR between the two algorithms
  is less than 0.10 dB, whereas changing the quality level by 1 in the upper
  range of the quality scale is typically more like a 1.0 dB difference).

- If the floating point algorithms in libjpeg-turbo are not implemented using
  SIMD instructions on a particular platform, then the accuracy of the
  floating point DCT/IDCT can depend on the compiler settings.
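
To relate the dB figures above to pixel error, the usual definition of PSNR for
8-bit samples is sketched below.  This is the textbook formula, offered as an
assumption about how such figures are typically measured rather than as a
description of this project's test procedure.  Here $x_i$ and $\hat{x}_i$
denote the original and decoded sample values, and $N$ is the number of
samples.

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2,
\qquad
\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \ \text{dB}
```

Under this definition, a PSNR gain of $\Delta$ dB corresponds to dividing the
MSE by $10^{\Delta/10}$, which is about 1.023 for 0.10 dB and about 1.259 for
1.0 dB.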

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
  implements those scaling algorithms differently than libjpeg v6b does, and
  libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

- When using chrominance subsampling, because libjpeg v8 implements this
  with its DCT/IDCT scaling algorithms rather than with a separate
  downsampling/upsampling algorithm.  In our testing, the subsampled/upsampled
  output of libjpeg v8 is less accurate than that of libjpeg v6b for this
  reason.

- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
  "non-smooth") chrominance upsampling, because libjpeg v8 does not support
  merged upsampling with scaling factors > 1.


Performance Pitfalls
====================

Restart Markers
---------------

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that is compatible with the rest of the libjpeg infrastructure, so it
is necessary to use the slow Huffman decoder when decompressing a JPEG image
that has restart markers.  This can cause the decompression performance to drop
by as much as 20%, but the performance will still be much greater than that of
libjpeg.  Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.
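
To find out whether a given image is likely to take the slow path described
above, an application can look for a DRI (Define Restart Interval) segment in
the JPEG header.  The sketch below walks the marker segments that precede the
first scan and returns the declared restart interval.  It relies only on the
JPEG marker layout (SOI = 0xFFD8, DRI = 0xFFDD, SOS = 0xFFDA) and does not call
libjpeg-turbo; `jpeg_restart_interval()` is a hypothetical helper name, and the
code does not handle a DRI segment that appears between scans.

```c
#include <assert.h>
#include <stddef.h>

/* Return the restart interval declared before the first scan, 0 if restart
   markers are not enabled there, or -1 if the buffer cannot be parsed. */
long jpeg_restart_interval(const unsigned char *buf, size_t len)
{
  size_t pos = 2;

  assert(buf != NULL);
  if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
    return -1;                               /* missing SOI marker */

  while (pos + 4 <= len) {
    unsigned int marker;
    size_t seglen;

    if (buf[pos] != 0xFF)
      return -1;                             /* not positioned at a marker */
    while (pos + 1 < len && buf[pos + 1] == 0xFF)
      pos++;                                 /* skip optional fill bytes */
    if (pos + 4 > len)
      return -1;

    marker = buf[pos + 1];
    if (marker == 0xDA || marker == 0xD9)
      return 0;                              /* SOS or EOI reached, no DRI */

    /* Segment length is big-endian and includes its own two bytes */
    seglen = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
    if (seglen < 2 || pos + 2 + seglen > len)
      return -1;
    if (marker == 0xDD) {
      if (seglen != 4)
        return -1;
      return ((long)buf[pos + 4] << 8) | buf[pos + 5];
    }
    pos += 2 + seglen;
  }
  return -1;                                 /* truncated header */
}
```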

Fast Integer Forward DCT at High Quality Levels
-----------------------------------------------

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100.  Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases.  This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
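
The advice above can be expressed as a small selection routine.  This is only a
sketch: `ForwardDct` and `choose_forward_dct()` are hypothetical names used for
illustration.  In the libjpeg API, the corresponding choice is made by setting
`cinfo.dct_method` to `JDCT_ISLOW` or `JDCT_IFAST` before starting compression.

```c
#include <assert.h>

/* Hypothetical stand-in for the JDCT_ISLOW/JDCT_IFAST values of the real API */
typedef enum { FDCT_SLOW_INT, FDCT_FAST_INT } ForwardDct;

/* Pick a forward DCT for a JPEG quality of 1-100.  The fast integer DCT is
   chosen only if the caller prefers it and the quality is below 98. */
ForwardDct choose_forward_dct(int quality, int prefer_fast)
{
  assert(quality >= 1 && quality <= 100);
  if (prefer_fast && quality < 98)
    return FDCT_FAST_INT;
  return FDCT_SLOW_INT;
}
```

For instance, `choose_forward_dct(75, 1)` selects the fast integer DCT, whereas
any call with a quality of 98 or higher selects the slow integer DCT.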