LZX Compression(3) Local Manual LZX Compression(3)
NAME
lzx_init, lzx_compress_block, lzx_finish - LZX compression
SYNOPSIS
#include <sys/types.h>
#include <lzx_compress.h>
int
lzx_init(lzx_data ** lzxdp, int wsize_code, lzx_get_bytes_t get_bytes,
void *get_bytes_arg, lzx_at_eof_t at_eof, lzx_put_bytes_t put_bytes,
void *put_bytes_arg, lzx_mark_frame_t mark_frame,
void *mark_frame_arg);
int
lzx_compress_block(lzx_data *lzxd, int block_size, int subdivide);
int
lzx_finish(lzx_data *lzxd, struct lzx_results *lzxr);
void
lzx_reset(lzx_data *lzxd);
DESCRIPTION
The lzx_init(), lzx_compress_block(), and lzx_finish() functions comprise
a compression engine for Microsoft's LZX compression format.
Initializing and releasing the LZX compressor
The lzx_init() function takes a wsize_code giving the base-2 logarithm
of the window size for compression: 15 means 32K, 16 means 64K, and so on
up to 21, which means 2MB. It also takes the following callback functions
and their associated arguments:
int get_bytes(void *get_bytes_arg, int n, void *buf)
The lzx_compress_block() routine calls this function when it
needs more uncompressed input to process. The number of bytes
requested is n and the bytes should be placed in the buffer
pointed to by buf. The get_bytes() function should return the
number of bytes actually provided, which must not be greater than
n and must not be 0 except at EOF.
int at_eof(void * get_bytes_arg)
Must return 0 if the end of the input data has not been reached,
positive otherwise. Note that this function takes the same
argument as get_bytes().
int put_bytes(void * put_bytes_arg, int n, void * buf)
The put_bytes() callback is called by lzx_compress_block() when
compressed bytes need to be output. The number of bytes to be
output is n and the bytes are in the buffer pointed to by buf.
int mark_frame(void *mark_frame_arg, uint32_t uncomp, uint32_t
comp)
The mark_frame() callback is called whenever LZX_FRAME_SIZE
(0x8000) uncompressed bytes have been processed. The current
locations in the uncompressed and compressed data streams (as of
the last call to put_bytes()) are provided in uncomp and comp
respectively. This is intended for .CHM (ITSS) and other similar
files which require a "reset table" listing the frame locations.
This callback is optional; if the mark_frame argument to
lzx_init() is NULL, no function will be called at the end of each
frame.
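As an illustration of the callback contracts above, the following is a minimal sketch of get_bytes(), at_eof(), and put_bytes() callbacks that work on memory buffers. The struct mem_source and struct mem_sink types and the mem_* function names are hypothetical and not part of the library; only the callback signatures are taken from the descriptions above. The success return value of put_bytes() is not specified in this manual, so returning n is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical input context: a read cursor over a caller-owned buffer. */
struct mem_source {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

/* Hypothetical output context: a fixed-capacity buffer being filled. */
struct mem_sink {
    unsigned char *data;
    size_t cap;
    size_t len;
    int overflow;   /* set when more than cap bytes were produced */
};

/* get_bytes: copy up to n bytes into buf; returns 0 only at end of input. */
static int
mem_get_bytes(void *arg, int n, void *buf)
{
    struct mem_source *src = arg;
    size_t avail = src->len - src->pos;
    size_t take;

    assert(n >= 0);
    take = (size_t)n < avail ? (size_t)n : avail;
    memcpy(buf, src->data + src->pos, take);
    src->pos += take;
    return (int)take;
}

/* at_eof: receives the same argument as get_bytes. */
static int
mem_at_eof(void *arg)
{
    const struct mem_source *src = arg;

    return src->pos >= src->len;
}

/* put_bytes: append n compressed bytes to the sink.
 * Assumption: n is returned on success, a negative value on failure. */
static int
mem_put_bytes(void *arg, int n, void *buf)
{
    struct mem_sink *dst = arg;

    assert(n >= 0);
    if ((size_t)n > dst->cap - dst->len) {
        dst->overflow = 1;   /* remember the failure for the caller */
        return -1;
    }
    memcpy(dst->data + dst->len, buf, (size_t)n);
    dst->len += (size_t)n;
    return n;
}
```

A pointer to the struct mem_source would be passed as get_bytes_arg and a pointer to the struct mem_sink as put_bytes_arg.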
The lzx_init() function allocates an opaque structure, a pointer to which
will be returned in lzxdp. A pointer to this structure may be passed to
the other LZX compression functions. The function returns a negative
value on error, 0 otherwise.
The lzx_finish() function writes out any unflushed data, releases all
memory held by the compressor (including the lzxd structure), and
optionally fills in the lzx_results structure, a pointer to which is
passed in as lzxr (NULL if results are not required).
Running the compressor
The lzx_compress_block() function takes the opaque pointer returned by
lzx_init(), a block_size, and a flag which says whether or not to
subdivide the block. If the subdivide flag is set, blocks may be subdivided
to increase compression ratio based on the entropy of the data at a given
point. Otherwise, just one block is created. The function returns a
negative value on error, 0 otherwise.
Note:
The block size must not be larger than the window size. While the
compressor will create apparently-valid LZX files if this restriction
is violated, some decompressors will not handle them.
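The relationship between wsize_code, the window size, and the block size limit in the note above can be sketched as follows. Both helper names are hypothetical and not part of the library; the 15 to 21 range is the one given for lzx_init() above.

```c
#include <assert.h>

/* Hypothetical helper: window size in bytes for a wsize_code, or -1 if
 * the code is outside the 15..21 range described for lzx_init(). */
static long
example_window_size(int wsize_code)
{
    if (wsize_code < 15 || wsize_code > 21)
        return -1;
    return 1L << wsize_code;
}

/* Hypothetical helper: clamp a requested block size so that it does not
 * exceed the window size selected by wsize_code. */
static int
example_clamp_block_size(int wsize_code, int requested)
{
    long window = example_window_size(wsize_code);

    assert(window > 0);
    assert(requested > 0);
    return requested > window ? (int)window : requested;
}
```

The clamped value is what a caller would pass as block_size to lzx_compress_block().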
The lzx_reset() function may be called after any block in order to reset
all compression state except the number of compressed and uncompressed
bytes processed. This forces the one-bit Intel preprocessing header to
be output again, the Lempel-Ziv window to be cleared, and the Huffman
tables to be reset to zero length. It should only be called on a frame
boundary; the results of calling it elsewhere or during a callback are
undefined.
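One way to respect the restriction on calling lzx_reset() during a callback is to let mark_frame() only record state, and to decide about resetting from the main loop between calls to lzx_compress_block(). The sketch below shows such a callback. The struct frame_log type, its fields, MAX_FRAMES, and the reset_interval policy are assumptions made for illustration and are not part of the library; the meaning of the mark_frame() return value is not specified in this manual, so returning 0 on success is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRAMES 1024   /* arbitrary capacity for this sketch */

/* Hypothetical context passed as mark_frame_arg. */
struct frame_log {
    uint32_t uncomp[MAX_FRAMES];  /* uncompressed offset at each frame */
    uint32_t comp[MAX_FRAMES];    /* compressed offset at each frame */
    int nframes;                  /* entries recorded so far */
    int since_reset;              /* frames seen since the last reset */
    int reset_interval;           /* assumed policy: reset every N frames */
};

/* mark_frame callback: only records state; it never calls lzx_reset(). */
static int
log_mark_frame(void *arg, uint32_t uncomp, uint32_t comp)
{
    struct frame_log *log = arg;

    if (log->nframes >= MAX_FRAMES)
        return -1;
    log->uncomp[log->nframes] = uncomp;
    log->comp[log->nframes] = comp;
    log->nframes++;
    log->since_reset++;
    return 0;
}

/* Called from the main loop after lzx_compress_block() returns: reports
 * whether a reset is due and, if so, restarts the frame count. */
static int
frame_log_reset_due(struct frame_log *log)
{
    assert(log->reset_interval > 0);
    if (log->since_reset < log->reset_interval)
        return 0;
    log->since_reset = 0;
    return 1;
}
```

When frame_log_reset_due() returns nonzero and the block just compressed ended on a frame boundary, the caller would invoke lzx_reset() before the next call to lzx_compress_block(). The recorded offsets can also serve as the raw material for a reset table.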
To compress data, simply call lzx_compress_block() and optionally
lzx_reset() repeatedly, handling the various callbacks described above,
until your data is exhausted.
ERRORS
The functions return a negative number on error.
The callbacks are intended to return a negative result on error, but this
is not yet understood by the compressor.
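Until the compressor acts on callback return values, one possible workaround is for a callback to record its failure in its own context, so that the caller can check it after each call into the library. The sketch below does this for a put_bytes() callback that writes to a stdio stream. The struct file_sink type and the file_put_bytes() name are hypothetical; returning n on success is an assumption, since this manual does not specify it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical context passed as put_bytes_arg. */
struct file_sink {
    FILE *fp;
    int saved_errno;   /* 0 while no write has failed */
};

/* put_bytes callback: writes n bytes and latches the first failure. */
static int
file_put_bytes(void *arg, int n, void *buf)
{
    struct file_sink *sink = arg;

    assert(n >= 0);
    if (sink->saved_errno != 0)
        return -1;              /* already failed; skip further writes */
    errno = 0;
    if (fwrite(buf, 1, (size_t)n, sink->fp) != (size_t)n) {
        /* Fall back to EIO if fwrite did not set errno. */
        sink->saved_errno = errno != 0 ? errno : EIO;
        return -1;
    }
    return n;
}
```

After each lzx_compress_block() call, the caller can inspect saved_errno and stop compressing if it is nonzero.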
BUGS
The compressor is currently unable to output an uncompressed block, so
incompressible data may expand more than is necessary (though still not
more than the 6144 bytes permitted by the CAB standard).
There is no well-defined set of error codes.
There is no way for the callbacks to report an error and abort the
compression.
The algorithm for splitting blocks is suboptimal.
AUTHOR
Matthew T. Russotto
REFERENCES
LZXFMT.DOC — Microsoft LZX Data Compression Format (part of Microsoft
Cabinet SDK)
Comments in cabextract.c, concerning errors in LZXFMT.DOC (part of
cabextract, at http://www.kyz.uklinux.net/cabextract.php3)
CHM file format documentation
(http://www.speakeasy.net/~russotto/chmformat.html)
SEE ALSO
cabextract(1)
LOCAL May 27, 2002 LOCAL