A program that compresses and decompresses ASCII files using canonical Huffman coding.
- In general, use the following to run the archiver program:
$ make build \
  && make test \
  && make run
$ cd build/src \
  && ./archiver --compress test.huff ../../tests/test_1.txt ../../tests/test_2.txt \
  && ./archiver --decompress test.huff
- For local development, you can try:
$ make local-init && make conan-build
./archiver -h
displays help for using the program.
./archiver -c archive_name file1 [file2 ...]
encodes the files file1, file2, ... and saves the result to the file archive_name.
./archiver -d archive_name
decodes the files from the archive archive_name and puts them in the current directory.
Nine-bit values are written from the lowest bit to the highest (analogous to little-endian, but for bits). That is, the bit corresponding to 2^0 comes first, followed by 2^1, and so on, up to the bit corresponding to 2^8.
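The bit order described above can be illustrated with a minimal Python sketch. It is not the archiver's own code: the helper names to_bits_lsb_first and from_bits_lsb_first are hypothetical, and the sketch only shows the order of bits within one value, not how the bits are packed into bytes.

```python
from typing import Iterable, List


def to_bits_lsb_first(value: int, width: int = 9) -> List[int]:
    """Return the bits of `value`, starting with the bit for 2^0."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit into {} bits".format(width))
    return [(value >> i) & 1 for i in range(width)]


def from_bits_lsb_first(bits: Iterable[int]) -> int:
    """Inverse of to_bits_lsb_first: the first bit has weight 2^0."""
    return sum(bit << i for i, bit in enumerate(bits))


# 259 = 2^8 + 2^1 + 2^0, so the low bits appear first in the stream.
bits = to_bits_lsb_first(259)  # [1, 1, 0, 0, 0, 0, 0, 0, 1]
restored = from_bits_lsb_first(bits)  # 259
```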
The archive file has the following format:
- A 9-bit number SYMBOLS_COUNT indicating the number of characters in the alphabet.
- Data block for recovering the canonical code:
  - SYMBOLS_COUNT values of 9 bits (alphabet characters in the order of canonical codes).
  - A list of MAX_SYMBOL_CODE_SIZE values of 9 bits, the i-th (when numbered from 0) element of which is the number of characters with the code length i + 1. MAX_SYMBOL_CODE_SIZE, the maximum code length in the current encoding, is not explicitly written to the file because it can be deduced from the available data.
- The encoded file name.
- The encoded service symbol FILENAME_END.
- The encoded content of the file.
- If there are additional files in the archive, the encoded service symbol ONE_MORE_FILE is written, and the encoding continues with the next file.
- The encoded service symbol ARCHIVE_END.