The Pack200 algorithm is not a general purpose compression algorithm but one specialized for compressing JAR archives. JAR archives compressed with Pack200 will in general be different from the original archive when decompressed again. More information can be found in the Javadocs of the Pack200.Packer class.
While the pack200 command line utility of the JDK creates GZip-compressed archives (.pack.gz) by default, the streams provided by the Pack200 package only perform the actual Pack200 operation. Wrap them in an additional GzipCompressor(In|Out)putStream in order to deal with deflated streams.
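For example, a .pack.gz file can be turned back into a JAR by chaining the two input streams. The following is a minimal sketch, not a definitive recipe; the file names archive.pack.gz and archive.jar are placeholders:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.pack200.Pack200CompressorInputStream;
import org.apache.commons.compress.utils.IOUtils;

public class UnpackPack200Gz {
    public static void main(String[] args) throws Exception {
        // The GZip layer undoes the deflate compression,
        // the Pack200 layer then reconstructs the JAR.
        try (InputStream fin = new FileInputStream("archive.pack.gz");
             InputStream gzIn = new GzipCompressorInputStream(fin);
             InputStream packIn = new Pack200CompressorInputStream(gzIn);
             OutputStream jarOut = new FileOutputStream("archive.jar")) {
            IOUtils.copy(packIn, jarOut);
        }
    }
}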
The Pack200 API provided by the Java class library is not streaming friendly as it wants to consume its input completely in a single operation. Because of this, Pack200CompressorInputStream's constructor will immediately unpack the stream, cache the results and provide an input stream to the cache.
Pack200CompressorOutputStream will cache all data that is written to it and then pack it once the finish or close method is called.
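Going the other way, a JAR can be packed and gzipped by writing its bytes to a Pack200CompressorOutputStream that wraps a GzipCompressorOutputStream. Again a sketch with placeholder file names:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.compress.compressors.pack200.Pack200CompressorOutputStream;
import org.apache.commons.compress.utils.IOUtils;

public class CreatePack200Gz {
    public static void main(String[] args) throws Exception {
        try (InputStream jarIn = new FileInputStream("archive.jar");
             OutputStream fout = new FileOutputStream("archive.pack.gz");
             OutputStream gzOut = new GzipCompressorOutputStream(fout);
             Pack200CompressorOutputStream packOut = new Pack200CompressorOutputStream(gzOut)) {
            // Everything written here is only cached; the actual Pack200
            // operation runs when finish() or close() is called.
            IOUtils.copy(jarIn, packOut);
        }
    }
}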
Two different caching modes are available: "in memory", which is the default, and "temporary file". You should switch to the temporary file option if your archives are really big.
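The mode can be passed to the stream constructors. The sketch below assumes the Pack200Strategy enum constants of the pack200 package and uses placeholder file names:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.pack200.Pack200CompressorInputStream;
import org.apache.commons.compress.compressors.pack200.Pack200Strategy;
import org.apache.commons.compress.utils.IOUtils;

public class UnpackWithTempFile {
    public static void main(String[] args) throws Exception {
        // Cache the intermediate result in a temporary file
        // instead of holding it in memory.
        try (InputStream packIn = new Pack200CompressorInputStream(
                 new FileInputStream("big-archive.pack"), Pack200Strategy.TEMP_FILE);
             OutputStream jarOut = new FileOutputStream("big-archive.jar")) {
            IOUtils.copy(packIn, jarOut);
        }
    }
}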
Since there always is an intermediate result, the getBytesRead and getCount methods of Pack200CompressorInputStream are meaningless (should they count bytes read from the real stream or from the intermediate result?) and always return 0.
As a pack/unpack cycle may create a JAR archive that is different from the original, digital signatures created for the initial JAR will be broken by the process. There is a way to "normalize" JAR archives prior to packing them that ensures signatures applied to the "normalized" JAR will still be valid after a pack/unpack cycle - see Pack200.Packer's javadocs.
The Pack200Utils class in the pack200 package provides several overloads of a normalize method that can be used to normalize a JAR archive either in place or by writing the result to a separate file.
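A short sketch of both variants, with placeholder file names:

import java.io.File;

import org.apache.commons.compress.compressors.pack200.Pack200Utils;

public class NormalizeJar {
    public static void main(String[] args) throws Exception {
        // Normalize the archive in place ...
        Pack200Utils.normalize(new File("archive.jar"));
        // ... or write the normalized copy to a separate file.
        Pack200Utils.normalize(new File("archive.jar"), new File("archive-normalized.jar"));
    }
}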