The Pack200 algorithm is not a general purpose compression algorithm but one specialized for compressing JAR archives. JAR archives compressed with Pack200 will in general be different from the original archive when decompressed again. More information can be found in the Javadocs of the Pack200.Packer class.
While the pack200 command line utility of the JDK creates GZip-compressed archives (.pack.gz) by default, the streams provided by the Pack200 package only perform the actual Pack200 operation. Wrap them in an additional GzipCompressor(In|Out)putStream in order to deal with deflated streams.
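For example, a .pack.gz file can be turned back into a JAR by chaining the two input streams. The following is a minimal sketch, not a definitive recipe; the file names archive.pack.gz and archive.jar are placeholders:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.pack200.Pack200CompressorInputStream;
import org.apache.commons.compress.utils.IOUtils;

public class UnpackPack200Gz {
    public static void main(String[] args) throws Exception {
        // The GZip layer undoes the deflate compression,
        // the Pack200 layer then reconstructs the JAR.
        try (InputStream fin = new FileInputStream("archive.pack.gz");
             InputStream gzIn = new GzipCompressorInputStream(fin);
             InputStream packIn = new Pack200CompressorInputStream(gzIn);
             OutputStream jarOut = new FileOutputStream("archive.jar")) {
            IOUtils.copy(packIn, jarOut);
        }
    }
}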
The Pack200 API provided by the Java class library is not streaming friendly as it wants to consume its input completely in a single operation. Because of this, Pack200CompressorInputStream's constructor will immediately unpack the stream, cache the results and provide an input stream to the cache.
Pack200CompressorOutputStream will cache all data that is written to it and then pack it once the finish or close method is called.
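Going the other way, a JAR can be packed and gzipped by writing its bytes to a Pack200CompressorOutputStream that wraps a GzipCompressorOutputStream. Again a sketch with placeholder file names:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.compress.compressors.pack200.Pack200CompressorOutputStream;
import org.apache.commons.compress.utils.IOUtils;

public class CreatePack200Gz {
    public static void main(String[] args) throws Exception {
        try (InputStream jarIn = new FileInputStream("archive.jar");
             OutputStream fout = new FileOutputStream("archive.pack.gz");
             OutputStream gzOut = new GzipCompressorOutputStream(fout);
             Pack200CompressorOutputStream packOut = new Pack200CompressorOutputStream(gzOut)) {
            // Everything written here is only cached; the actual Pack200
            // operation runs when finish() or close() is called.
            IOUtils.copy(jarIn, packOut);
        }
    }
}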
Two different caching modes are available: "in memory", which is the default, and "temporary file". You should switch to the temporary file option if your archives are really big.
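The mode can be passed to the stream constructors. The sketch below assumes the Pack200Strategy enum constants of the pack200 package and uses placeholder file names:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.pack200.Pack200CompressorInputStream;
import org.apache.commons.compress.compressors.pack200.Pack200Strategy;
import org.apache.commons.compress.utils.IOUtils;

public class UnpackWithTempFile {
    public static void main(String[] args) throws Exception {
        // Cache the intermediate result in a temporary file
        // instead of holding it in memory.
        try (InputStream packIn = new Pack200CompressorInputStream(
                 new FileInputStream("big-archive.pack"), Pack200Strategy.TEMP_FILE);
             OutputStream jarOut = new FileOutputStream("big-archive.jar")) {
            IOUtils.copy(packIn, jarOut);
        }
    }
}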
Since there always is an intermediate result, the getBytesRead and getCount methods of Pack200CompressorInputStream are meaningless (should they count bytes read from the real stream or from the intermediate result?) and always return 0.
As a pack/unpack cycle may create a JAR archive that is different from the original, digital signatures created for the initial JAR will be broken by the process. There is a way to "normalize" JAR archives prior to packing them that ensures signatures applied to the "normalized" JAR will still be valid after a pack/unpack cycle - see Pack200.Packer's javadocs.
The Pack200Utils class in the pack200 package provides several overloads of a normalize method that can be used to normalize a JAR archive either in place or by writing the result to a separate file.
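A short sketch of both variants, with placeholder file names:

import java.io.File;

import org.apache.commons.compress.compressors.pack200.Pack200Utils;

public class NormalizeJar {
    public static void main(String[] args) throws Exception {
        // Normalize the archive in place ...
        Pack200Utils.normalize(new File("archive.jar"));
        // ... or write the normalized copy to a separate file.
        Pack200Utils.normalize(new File("archive.jar"), new File("archive-normalized.jar"));
    }
}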