LZMA SDK 9.38
-------------

LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was designed to maximize the compression ratio while keeping
high decompression speed and low memory requirements for
decompression.

LZMA2 is an LZMA-based compression method. LZMA2 provides better
multithreading support for compression than LZMA and some other improvements.

7z is a file format for data compression and file archiving.
7z is the main file format for the 7-Zip compression program (www.7-zip.org).
The 7z format supports different compression methods: LZMA, LZMA2 and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC check, filters for
improved compression ratio, splitting to blocks and streams.



LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

You can copy, modify, distribute and perform LZMA SDK code, even for commercial purposes,
all without asking permission.

LZMA SDK code is compatible with open source licenses. For example, you can
include it in GNU GPL or GNU LGPL code.


LZMA SDK Contents
-----------------

  Source code:

    - C / C++ / C# / Java   - LZMA compression and decompression
    - C / C++               - LZMA2 compression and decompression
    - C / C++               - XZ compression and decompression
    - C                     - 7z decompression
    -     C++               - 7z compression and decompression
    - C                     - small SFXs for installers (7z decompression)
    -     C++               - SFXs and SFXs for installers (7z decompression)

  Precompiled binaries:

    - console programs for lzma / 7z / xz compression and decompression
    - SFX modules for installers.


UNIX/Linux version
------------------
To compile the C++ version of the file->file LZMA encoder, go to the directory
CPP/7zip/Bundles/LzmaCon
and call make to recompile it:
  make -f makefile.gcc clean all

In some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static

You can also use p7zip (a port of 7-Zip for POSIX systems like Unix or Linux):

  http://p7zip.sourceforge.net/


Files
-----

DOC/7zC.txt          - 7z ANSI-C Decoder description
DOC/7zFormat.txt     - 7z Format description
DOC/installer.txt    - information about 7-Zip for installers
DOC/lzma.txt         - LZMA compression description
DOC/lzma-sdk.txt     - LZMA SDK description (this file)
DOC/lzma-history.txt - history of LZMA SDK
DOC/lzma-specification.txt - Specification of LZMA
DOC/Methods.txt      - Compression method IDs for .7z

bin/installer/   - example script to create installer that uses SFX module.

bin/7zdec.exe    - simplified 7z archive decoder
bin/7zr.exe      - 7-Zip console program (reduced version)
bin/x64/7zr.exe  - 7-Zip console program (reduced version) (x64 version)
bin/lzma.exe     - file->file LZMA encoder/decoder for Windows
bin/7zS2.sfx     - small SFX module for installers (GUI version)
bin/7zS2con.sfx  - small SFX module for installers (Console version)
bin/7zSD.sfx     - SFX module for installers.


7zDec.exe
---------
7zDec.exe is a simplified 7z archive decoder.
It supports only the LZMA, LZMA2, and PPMd methods.
7zDec decodes a whole solid block from the 7z archive to RAM,
so the RAM consumption can be high.




Source code structure
---------------------


Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)

C/  - C files (compression / decompression and other)
  Util/
    7z       - 7z decoder program (decoding 7z files)
    Lzma     - LZMA program (file->file LZMA encoder/decoder).
    LzmaLib  - LZMA library (.DLL for Windows)
    SfxSetup - small SFX module for installers

CPP/ - C++ files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles  - Modules that are bundles of other modules (files)

      Alone7z       - 7zr.exe: Standalone 7-Zip console program (reduced version)
      Format7zExtractR  - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
      Format7zR         - 7zr.dll:  Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      LzmaCon       - lzma.exe: LZMA compression/decompression
      LzmaSpec      - example code for LZMA Specification
      SFXCon        - 7zCon.sfx: Console 7z SFX module
      SFXSetup      - 7zS.sfx: 7z SFX module for installers
      SFXWin        - 7z.sfx: GUI 7z SFX module

    Common   - common files for 7-Zip

    Compress - files for compression/decompression

    Crypto   - files for encryption / decryption

    UI       - User Interface files

      Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common   - Common UI files
      Console  - Code for console program (7z.exe)
      Explorer    - Some code from 7-Zip Shell extension
      FileManager - Some GUI code from 7-Zip File Manager
      GUI         - Some GUI code from 7-Zip


CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ     - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      LzmaAlone    - file->file LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)

Java/  - Java files
  SevenZip
    Compression    - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)


Note:
  Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.
  7-Zip's source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on modern 2 GHz CPU
      - 1-2 MB/s on 200 MHz simple RISC CPU (ARM, MIPS, PowerPC)
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB
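
As a rough illustration of the memory figure in the list above, the
following Python sketch evaluates 16 KB + DictionarySize for a few
dictionary sizes. The function name lzma_decoder_memory is hypothetical
and not part of the SDK, and the result is the estimate quoted above,
not a measured value.

```python
KB = 1024


def lzma_decoder_memory(dict_size_bytes: int) -> int:
    """Estimated decoder RAM in bytes: 16 KB of state plus the dictionary.

    Hypothetical helper; it only evaluates the rule of thumb from the
    feature list (16 KB + DictionarySize).
    """
    if dict_size_bytes < 0:
        raise ValueError("dictionary size must be non-negative")
    return 16 * KB + dict_size_bytes


# Dictionary sizes of 64 KB, 8 MB and 1 GB (2^16, 2^23, 2^30 bytes).
for bits in (16, 23, 30):
    estimate = lzma_decoder_memory(1 << bits)
    print(f"2^{bits} byte dictionary -> about {estimate // KB} KB of RAM")
```

For small dictionaries the fixed 16 KB part is a visible share of the
total; for large ones the dictionary dominates.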

The LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (the penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompression mostly depends on CPU speed.
Memory speed matters much less. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
--------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with the LZMA method. Benchmark shows the rating in MIPS (million
     instructions per second). The rating value is calculated from
     the measured speed and is normalized with Intel's Core 2 results.
     Benchmark also checks for possible hardware errors (RAM
     errors in most cases). Benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     The default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by the LZMA method with dictionary
          size D = 2^N you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives a gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          The lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0
          if you change the lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          The pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on the dictionary size
              (parameter "d" in the table below).

               MF_ID     Memory                   Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          the eos marker, since the LZMA decoder knows the uncompressed size
          stored in the .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout
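
The -d and -mf descriptions above can be combined into a small
calculator for the encoder's memory needs. This Python sketch only
evaluates the formulas from the switch list and the Match Finder table;
the names dictionary_size and encoder_memory are hypothetical and are
not part of the SDK.

```python
MB = 1024 * 1024

# Multipliers of the dictionary size "d", copied from the Match Finder table.
MF_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}


def dictionary_size(n: int) -> int:
    """DictionarySize = 2^N bytes for the -d{N} switch, N in [0, 30]."""
    if not 0 <= n <= 30:
        raise ValueError("N must be in the range [0, 30]")
    return 1 << n


def encoder_memory(mf_id: str, n: int) -> int:
    """Estimated encoder memory in bytes: d * factor + 4 MB."""
    if mf_id not in MF_FACTOR:
        raise ValueError(f"unknown match finder: {mf_id}")
    return int(dictionary_size(n) * MF_FACTOR[mf_id]) + 4 * MB


# Compare all match finders at the default dictionary size (-d23, 8 MB).
for mf_id in MF_FACTOR:
    print(f"-mf{mf_id} -d23: about {encoder_memory(mf_id, 23) / MB:.0f} MB")
```

With these formulas, the choice of match finder changes the estimate
by a few bytes per dictionary byte, while each step of N doubles the
dictionary-dependent part.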


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with a 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces the memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.
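
The settings from the examples above can also be tried from Python.
This is a sketch under stated assumptions: it uses Python's standard
lzma module, which is a binding to liblzma from XZ Utils and not the
LZMA SDK itself, so its output need not match lzma.exe byte for byte.
The header layout assumed in parse_lzma_header (one properties byte
equal to (pb * 5 + lp) * 9 + lc, then 4 bytes of dictionary size and
8 bytes of uncompressed size, little-endian) follows the LZMA
specification. The function names and the sample data are hypothetical.

```python
import lzma
import struct


def compress_alone(data, dict_bits=23, lc=3, lp=0, pb=2):
    """Compress to the .lzma container with explicit LZMA settings.

    Note: liblzma, unlike the ranges listed for lzma.exe, requires lc + lp <= 4.
    """
    filters = [{
        "id": lzma.FILTER_LZMA1,
        "dict_size": 1 << dict_bits,  # counterpart of -d{N}
        "lc": lc,                     # counterpart of -lc{N}
        "lp": lp,                     # counterpart of -lp{N}
        "pb": pb,                     # counterpart of -pb{N}
    }]
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters)


def parse_lzma_header(blob):
    """Decode the 13-byte .lzma header into its fields."""
    if len(blob) < 13:
        raise ValueError("a .lzma header is 13 bytes long")
    props, dict_size, size = struct.unpack("<BIQ", blob[:13])
    unknown = size == 0xFFFFFFFFFFFFFFFF  # all ones: size not stored
    return {
        "lc": props % 9,
        "lp": (props // 9) % 5,
        "pb": props // 45,
        "dict_size": dict_size,
        "uncompressed_size": None if unknown else size,
    }


sample = b"\x01\x00\x00\xea" * 4096              # 32-bit periodical data
ex1 = compress_alone(sample, dict_bits=16, lc=0)  # like: -d16 -lc0
ex2 = compress_alone(sample, lc=0, lp=2)          # like: -lc0 -lp2
restored = lzma.decompress(ex1, format=lzma.FORMAT_ALONE)  # like: LZMA d
print(restored == sample, parse_lzma_header(ex1), parse_lzma_header(ex2))
```

The uncompressed size may be reported as None here, because the
library may end the stream with an End Of Stream marker instead of
storing the size in the header.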


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compression, it's desirable
to have aligned data (if possible). It's also desirable to arrange
the data so that code is grouped in one place and data is
grouped in another place (this is better than mixing: code, data, code,
data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find the C source code of such filters in the C/Bra*.* files.

You can check the compression ratio gain of these filters with the
following 7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works as follows:
Compressing    = Filter_encoding + LZMA_encoding
Decompressing  = LZMA_decoding + Filter_decoding
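
The same two-stage pipeline can be sketched in Python. This is an
illustration under assumptions, not the 7-Zip implementation: it uses
Python's standard lzma module (a binding to liblzma from XZ Utils), the
XZ container and LZMA2 instead of 7z and LZMA, and liblzma's own ARM
filter (lzma.FILTER_ARM). The function names are hypothetical.

```python
import lzma


def compress_plain(data):
    """LZMA2 only, comparable in spirit to: 7z a a1.7z a.bin -m0=lzma"""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def compress_with_arm_filter(data):
    """ARM filter first, then LZMA2: Filter_encoding + LZMA_encoding."""
    filters = [
        {"id": lzma.FILTER_ARM},                  # Filter_encoding stage
        {"id": lzma.FILTER_LZMA2, "preset": 6},   # compression stage
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def compare(data):
    """Return (plain size, filtered size) after checking both round trips."""
    plain = compress_plain(data)
    filtered = compress_with_arm_filter(data)
    # The filter chain is stored in the .xz stream, so decompression
    # applies the stages in reverse order automatically.
    if lzma.decompress(plain) != data or lzma.decompress(filtered) != data:
        raise RuntimeError("round trip failed")
    return len(plain), len(filtered)
```

Calling compare() on the bytes of a real ARM binary shows whether the
filter helps for that file; on data that is not ARM code the filtered
result can be the same size or slightly larger.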

The compressing and decompressing speed of such filters is very high,
so they will not increase decompression time too much.
Moreover, filtering reduces the time spent in LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.
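
The idea of the CALL conversion can be shown with a deliberately
simplified Python sketch for x86, where the CALL opcode 0xE8 is
followed by a 32-bit little-endian relative offset. This is not the
code from C/Bra*.*: the real filters apply extra checks to avoid
converting bytes that are not CALL instructions. All names here are
hypothetical.

```python
def _convert(data: bytes, encoding: bool) -> bytes:
    """Rewrite the 4-byte operand after every 0xE8 byte.

    Encoding turns a relative offset into an absolute address,
    decoding does the reverse. Operand bytes are skipped, never scanned,
    so the encoder and decoder visit exactly the same positions.
    """
    out = bytearray(data)
    i = 0
    while i + 5 <= len(out):
        if out[i] == 0xE8:
            value = int.from_bytes(out[i + 1:i + 5], "little")
            next_ip = i + 5  # address of the instruction after the CALL
            if encoding:
                value = (value + next_ip) & 0xFFFFFFFF
            else:
                value = (value - next_ip) & 0xFFFFFFFF
            out[i + 1:i + 5] = value.to_bytes(4, "little")
            i += 5
        else:
            i += 1
    return bytes(out)


def call_filter_encode(data: bytes) -> bytes:
    return _convert(data, encoding=True)


def call_filter_decode(data: bytes) -> bytes:
    return _convert(data, encoding=False)


# Two CALLs to the same target address 0x100, at offsets 0 and 16.
code = (b"\xE8" + (0x100 - 5).to_bytes(4, "little") + b"\x90" * 11
        + b"\xE8" + (0x100 - 21).to_bytes(4, "little"))
encoded = call_filter_encode(code)
print(code.hex(), encoded.hex())
```

Before filtering, the two CALLs carry different relative offsets;
after filtering, both carry the same absolute address, which gives
the compressor a repeated byte sequence to match.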

For some ISAs (for example, for MIPS) it's impossible to get a gain from such a filter.



---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html
357