*******************************************************************************
**     Background
*******************************************************************************

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON) to accelerate baseline JPEG compression and decompression on x86, x86-64,
and ARM systems.  On such systems, libjpeg-turbo is generally 2-4x as fast as
libjpeg, all else being equal.  On other types of systems, libjpeg-turbo can
still outperform libjpeg by a significant amount, by virtue of its
highly optimized Huffman coding routines.  In many cases, the performance of
libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less
powerful but more straightforward TurboJPEG API.  libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru.  The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.


*******************************************************************************
**     License
*******************************************************************************

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to LICENSE.txt for a roll-up of license terms.


*******************************************************************************
**     Using libjpeg-turbo
*******************************************************************************

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

  TurboJPEG API:  This API provides an easy-to-use interface for compressing
  and decompressing JPEG images in memory.  It also provides some functionality
  that would not be straightforward to achieve using the underlying libjpeg
  API, such as generating planar YUV images and performing multiple
  simultaneous lossless transforms on an image.  The Java interface for
  libjpeg-turbo is written on top of the TurboJPEG API.

  libjpeg API:  This is the de facto industry-standard API for compressing and
  decompressing JPEG images.  It is more difficult to use than the TurboJPEG
  API but also more powerful.  The libjpeg API implementation in libjpeg-turbo
  is both API/ABI-compatible and mathematically compatible with libjpeg v6b.
  It can also optionally be configured to be API/ABI-compatible with libjpeg v7
  and v8 (see below).

There is no significant performance advantage to either API when both are used
to perform similar operations.

=====================
Colorspace Extensions
=====================

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering.  This is implemented with ten new
colorspace constants:

  JCS_EXT_RGB   /* red/green/blue */
  JCS_EXT_RGBX  /* red/green/blue/x */
  JCS_EXT_BGR   /* blue/green/red */
  JCS_EXT_BGRX  /* blue/green/red/x */
  JCS_EXT_XBGR  /* x/blue/green/red */
  JCS_EXT_XRGB  /* x/red/green/blue */
  JCS_EXT_RGBA  /* red/green/blue/alpha */
  JCS_EXT_BGRA  /* blue/green/red/alpha */
  JCS_EXT_ABGR  /* alpha/blue/green/red */
  JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting cinfo.in_color_space (compression) or cinfo.out_color_space
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.

Your application can check for the existence of these extensions at compile
time with:

  #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes.  If an application expects the X
byte to be used as an alpha channel, then it should specify JCS_EXT_RGBA,
JCS_EXT_BGRA, JCS_EXT_ABGR, or JCS_EXT_ARGB.  When these colorspace constants
are used, the X byte is guaranteed to be 0xFF, which is interpreted as opaque.
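As a rough illustration of how the byte positions differ between the ten
constants, the sketch below tabulates them and stores one decoded pixel.  It is
not libjpeg-turbo's implementation.  Every sk_ name is a hypothetical stand-in,
defined only so that the example builds without jpeglib.h; real code would use
the JCS_EXT_* constants directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ten JCS_EXT_* constants (assumption: the
   numeric values here are arbitrary and do not match jpeglib.h). */
typedef enum {
  SK_RGB, SK_RGBX, SK_BGR, SK_BGRX, SK_XBGR, SK_XRGB,
  SK_RGBA, SK_BGRA, SK_ABGR, SK_ARGB, SK_NUM_FORMATS
} sk_format;

/* Byte offsets within one pixel.  extra is the offset of the X or alpha byte
   (-1 if the format has none); has_alpha marks the four alpha formats. */
typedef struct {
  int pixel_size, red, green, blue, extra, has_alpha;
} sk_layout;

static const sk_layout sk_layouts[SK_NUM_FORMATS] = {
  { 3, 0, 1, 2, -1, 0 },  /* RGB:  red/green/blue */
  { 4, 0, 1, 2,  3, 0 },  /* RGBX: red/green/blue/x */
  { 3, 2, 1, 0, -1, 0 },  /* BGR:  blue/green/red */
  { 4, 2, 1, 0,  3, 0 },  /* BGRX: blue/green/red/x */
  { 4, 3, 2, 1,  0, 0 },  /* XBGR: x/blue/green/red */
  { 4, 1, 2, 3,  0, 0 },  /* XRGB: x/red/green/blue */
  { 4, 0, 1, 2,  3, 1 },  /* RGBA: red/green/blue/alpha */
  { 4, 2, 1, 0,  3, 1 },  /* BGRA: blue/green/red/alpha */
  { 4, 3, 2, 1,  0, 1 },  /* ABGR: alpha/blue/green/red */
  { 4, 1, 2, 3,  0, 1 }   /* ARGB: alpha/red/green/blue */
};

/* Store one decoded pixel at dst.  For the alpha formats the extra byte is
   set to 0xFF (opaque).  For the X formats it is left untouched, because its
   value is undefined and callers must not rely on it. */
void sk_store_pixel(unsigned char *dst, sk_format fmt,
                    unsigned char r, unsigned char g, unsigned char b)
{
  const sk_layout *l;

  assert(dst != NULL && (unsigned)fmt < (unsigned)SK_NUM_FORMATS);
  l = &sk_layouts[fmt];
  dst[l->red] = r;
  dst[l->green] = g;
  dst[l->blue] = b;
  if (l->has_alpha) dst[l->extra] = 0xFF;
}
```

The table also shows why an application that treats the fourth byte as alpha
should request one of the alpha formats: only those rows promise a value for
that byte.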

Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

  #ifdef JCS_ALPHA_EXTENSIONS

jcstest.c, located in the libjpeg-turbo source tree, demonstrates how to check
for the existence of the colorspace extensions at compile time and run time.
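The two compile-time checks can be combined into a small probe, sketched below.
This is not jcstest.c.  The sk_ function names are hypothetical, and the
__has_include guard is an assumption about the compiler (GCC 5+, Clang, or any
C23 compiler) that lets the sketch build even where no libjpeg headers are
installed.

```c
#include <stdio.h>   /* jpeglib.h needs FILE and size_t declared first */

/* Include jpeglib.h only if the compiler can find it. */
#if defined(__has_include)
#if __has_include(<jpeglib.h>)
#include <jpeglib.h>
#endif
#endif

/* Returns 1 if the headers used for this build declare JCS_EXT_RGB and the
   other non-alpha colorspace extensions, 0 otherwise. */
int sk_have_colorspace_extensions(void)
{
#ifdef JCS_EXTENSIONS
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if the headers also declare JCS_EXT_RGBA and the other alpha
   channel colorspace extensions, 0 otherwise. */
int sk_have_alpha_extensions(void)
{
#ifdef JCS_ALPHA_EXTENSIONS
  return 1;
#else
  return 0;
#endif
}
```

A result of 1 only describes the headers seen at compile time.  The library
that is loaded at run time may still reject the extended colorspaces, which is
why the run-time error described above is worth trapping as well.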

===================================
libjpeg v7 and v8 API/ABI Emulation
===================================

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures.  Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases.  Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base.  Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distributions)
made the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo.  It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling.  libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of --with-jpeg7 or --with-jpeg8 to configure, or an
argument of -DWITH_JPEG7=1 or -DWITH_JPEG8=1 to CMake, you can build a version
of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so that programs
that are built against libjpeg v7 or v8 can be run with libjpeg-turbo.  The
following section describes which libjpeg v7+ features are supported and which
are not.

Support for libjpeg v7 and v8 Features:
---------------------------------------

Fully supported:

-- libjpeg: IDCT scaling extensions in decompressor
   libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
   1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
   and 1/2 are SIMD-accelerated).

-- libjpeg: arithmetic coding

-- libjpeg: In-memory source and destination managers
   See notes below.

-- cjpeg: Separate quality settings for luminance and chrominance
   Note that the libjpeg v7+ API was extended to accommodate this feature only
   for convenience purposes.  It has always been possible to implement this
   feature with libjpeg v6b (see rdswitch.c for an example).

-- cjpeg: 32-bit BMP support

-- cjpeg: -rgb option

-- jpegtran: lossless cropping

-- jpegtran: -perfect option

-- jpegtran: forcing width/height when performing lossless crop

-- rdjpgcom: -raw option

-- rdjpgcom: locale awareness
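
Every IDCT scaling factor listed under "Fully supported" is a multiple of 1/8.
Assuming the decompressor rounds a scaled dimension up to the next whole pixel,
as libjpeg's output dimension calculation does, the size of the decompressed
image can be estimated ahead of time.  The helper below is a hypothetical
sketch (sk_scaled_size is not a libjpeg-turbo function).

```c
#include <assert.h>

/* Scaled size of one image dimension for a scaling factor of num/denom,
   rounded up to a whole pixel.  Assumption: dimension * num fits in a long,
   which holds for any JPEG dimension (at most 65535) and num <= 16. */
long sk_scaled_size(long dimension, int num, int denom)
{
  assert(dimension > 0 && num > 0 && denom > 0);
  return (dimension * num + denom - 1) / denom;
}
```

For example, sk_scaled_size(640, 3, 8) evaluates to 240, and a 1921-pixel-wide
image decompressed at 1/2 comes out 961 pixels wide rather than 960.  With the
libjpeg API, the factor itself is requested through cinfo.scale_num and
cinfo.scale_denom on the decompression structure.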


Not supported:

NOTE:  As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement.  The reader is invited to peruse the research at
http://www.libjpeg-turbo.org/About/SmartScale and draw their own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

-- libjpeg: DCT scaling in compressor
   cinfo.scale_num and cinfo.scale_denom are silently ignored.
   There is no technical reason why DCT scaling could not be supported when
   emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
   below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
   8/9 would be available, which is of limited usefulness.

-- libjpeg: SmartScale
   cinfo.block_size is silently ignored.
   SmartScale is an extension to the JPEG format that allows for DCT block
   sizes other than 8x8.  Providing support for this new format would be
   feasible (particularly without full acceleration).  However, until/unless
   the format becomes either an official industry standard or, at minimum, an
   accepted solution in the community, we are hesitant to implement it, as
   there is no sense of whether or how it might change in the future.  It is
   our belief that SmartScale has not demonstrated sufficient usefulness
   either as a lossless format or as a means of quality enhancement, and thus,
   our primary interest in providing this feature would be as a means of
   supporting additional DCT scaling factors.

-- libjpeg: Fancy downsampling in compressor
   cinfo.do_fancy_downsampling is silently ignored.
   This requires the DCT scaling feature, which is not supported.

-- jpegtran: Scaling
   This requires both the DCT scaling and SmartScale features, which are not
   supported.

-- Lossless RGB JPEG files
   This requires the SmartScale feature, which is not supported.

What About libjpeg v9?
----------------------

libjpeg v9 introduced yet another field to the JPEG compression structure
(color_transform), thus making the ABI backward incompatible with that of
libjpeg v8.  This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding.  Further, there was actually no reason to extend
the API in this manner, as the color transform could have just as easily been
activated by way of a new JPEG colorspace constant, thus preserving backward
ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats.  Thus, at this time, it is our belief that
there is not sufficient technical justification for software to upgrade from
libjpeg v8 to libjpeg v9, and therefore, not sufficient technical justification
for us to emulate the libjpeg v9 ABI.

=====================================
In-Memory Source/Destination Managers
=====================================

By default, libjpeg-turbo 1.3 and later include the jpeg_mem_src() and
jpeg_mem_dest() functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well.  This allows the use of those functions by
programs that need them without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of --without-mem-srcdst to configure or
an argument of -DWITH_MEM_SRCDST=0 to CMake prior to building libjpeg-turbo.
This will restore the pre-1.3 behavior, in which jpeg_mem_src() and
jpeg_mem_dest() are only included when emulating the libjpeg v8 API/ABI.

On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.0.0 to 62.1.0 if using libjpeg v6b API/ABI
emulation and from 7.0.0 to 7.1.0 if using libjpeg v7 API/ABI emulation.

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used.  Thus, if a program
is built against libjpeg-turbo 1.3+ and uses jpeg_mem_src() or jpeg_mem_dest(),
that program will not fail if run against an older version of libjpeg-turbo or
against libjpeg v7 or earlier until the program actually tries to call
jpeg_mem_src() or jpeg_mem_dest().  Such is not the case on Windows.  If a
program is built against the libjpeg-turbo 1.3+ DLL and uses jpeg_mem_src() or
jpeg_mem_dest(), then it must use the libjpeg-turbo 1.3+ DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions.  See their respective man pages for more
details.


*******************************************************************************
**     Mathematical Compatibility
*******************************************************************************

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b.  The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is very
   slightly more accurate than the implementation in libjpeg v6b, but not by
   any amount perceptible to human vision (generally in the range of 0.01 to
   0.08 dB gain in PSNR).
-- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
   (and slightly faster) floating point IDCT algorithm introduced in libjpeg
   v8a as opposed to the algorithm used in libjpeg v6b.  It should be noted,
   however, that this algorithm basically brings the accuracy of the floating
   point IDCT in line with the accuracy of the slow integer IDCT.  The floating
   point DCT/IDCT algorithms are mainly a legacy feature, and they do not
   produce significantly more accuracy than the slow integer algorithms (to put
   numbers on this, the typical difference in PSNR between the two algorithms
   is less than 0.10 dB, whereas changing the quality level by 1 in the upper
   range of the quality scale is typically more like a 1.0 dB difference).
-- If the floating point algorithms in libjpeg-turbo are not implemented using
   SIMD instructions on a particular platform, then the accuracy of the
   floating point DCT/IDCT can depend on the compiler settings.

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

-- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
   implements those scaling algorithms differently than libjpeg v6b does, and
   libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

-- When using chrominance subsampling, because libjpeg v8 implements this
   with its DCT/IDCT scaling algorithms rather than with a separate
   downsampling/upsampling algorithm.  In our testing, the subsampled/upsampled
   output of libjpeg v8 is less accurate than that of libjpeg v6b for this
   reason.

-- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
   "non-smooth") chrominance upsampling, because libjpeg v8 does not support
   merged upsampling with scaling factors > 1.


*******************************************************************************
**     Performance Pitfalls
*******************************************************************************

===============
Restart Markers
===============

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that is compatible with the rest of the libjpeg infrastructure, so it
is necessary to use the slow Huffman decoder when decompressing a JPEG image
that has restart markers.  This can cause the decompression performance to drop
by as much as 20%, but the performance will still be much greater than that of
libjpeg.  Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.
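
To find out whether a given file is likely to take the slower path, an
application can look for a DRI (define restart interval, 0xFFDD) segment in the
JPEG headers, since a nonzero restart interval is what tells a decoder to
expect restart markers.  The function below is a minimal sketch with a
hypothetical name.  It assumes a well-formed header, examines only the segments
before the first scan, and does not reproduce the logic that libjpeg-turbo
itself uses to select a Huffman decoder.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the restart interval (in MCUs) declared before the first scan of a
   JPEG stream, 0 if none is declared, or -1 if the buffer cannot be parsed
   as a JPEG header.  A nonzero result means restart markers are in use. */
long sk_restart_interval(const unsigned char *buf, size_t size)
{
  size_t pos = 2;
  long interval = 0;

  assert(buf != NULL);
  if (size < 4 || buf[0] != 0xFF || buf[1] != 0xD8) return -1;  /* no SOI */

  while (pos + 4 <= size) {
    unsigned int marker, length;

    if (buf[pos] != 0xFF) return -1;          /* not at a marker */
    marker = buf[pos + 1];
    if (marker == 0xFF) { pos++; continue; }  /* skip a fill byte */
    if (marker == 0xDA || marker == 0xD9)     /* SOS or EOI: headers done */
      return interval;

    /* Segment length is big-endian and counts its own two bytes. */
    length = ((unsigned int)buf[pos + 2] << 8) | buf[pos + 3];
    if (length < 2 || length > size - pos - 2) return -1;

    if (marker == 0xDD) {                     /* DRI segment */
      if (length != 4) return -1;
      interval = ((long)buf[pos + 4] << 8) | buf[pos + 5];
    }
    pos += 2 + (size_t)length;
  }
  return -1;  /* ran out of data before the first scan */
}
```

An application that controls its own encoder can avoid the penalty by not
requesting restart markers in the first place; this check is mainly useful for
files that arrive from elsewhere.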

===============================================
Fast Integer Forward DCT at High Quality Levels
===============================================

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100.  Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases.  This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
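
The advice above amounts to a one-line rule, sketched below.  The enum and
function are hypothetical stand-ins, defined only so that the example builds
without jpeglib.h.  In a real program the equivalent constants are JDCT_ISLOW
and JDCT_IFAST, and the choice is stored in cinfo.dct_method.

```c
#include <assert.h>

/* Stand-ins for libjpeg's JDCT_ISLOW and JDCT_IFAST (assumption: used here
   only for illustration; real code would use the jpeglib.h constants). */
typedef enum { SK_DCT_ISLOW, SK_DCT_IFAST } sk_dct_method;

/* Pick a forward DCT for the given JPEG quality (1-100).  prefer_fast says
   whether the caller would like the fast integer DCT where it is safe. */
sk_dct_method sk_choose_forward_dct(int quality, int prefer_fast)
{
  assert(quality >= 1 && quality <= 100);
  /* At quality 98-100 the fast integer DCT forces the non-SIMD quantization
     function, so the slow integer DCT is the better choice there. */
  if (quality >= 98) return SK_DCT_ISLOW;
  return prefer_fast ? SK_DCT_IFAST : SK_DCT_ISLOW;
}
```

Below quality 98 the sketch simply honors the caller's preference, since the
text above describes no penalty in that range.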