Support transport-level compression #256

mattrm456 · 2025-03-03T14:34:16Z

Many applications want to compress either an entire data stream (or all datagrams) or only the payload carried inside a higher-level message protocol. This PR acknowledges this not-unusual requirement by providing an API mechanism to deflate/inflate a data stream. It also formalizes whole-transport compression: a user may optionally apply a deflater to all outgoing data sent through an ntci::StreamSocket or ntci::DatagramSocket, and optionally apply an inflater to all incoming data received by an ntci::StreamSocket or ntci::DatagramSocket. Similar to how TLS is integrated, the user "sees" only the uncompressed data.

The general idea is that, when a deflater is attached to a socket, all data given to "send" is first deflated, internally and automatically, before any attempt to copy it to the socket send buffer or store it on the write queue. Similarly, all data copied from the socket receive buffer is internally and automatically inflated before being staged in the read queue, where it is conditionally offered to the user for processing according to their receive criteria and read queue low watermark. Note that compression and encryption can be applied simultaneously; the internal implementation takes care to first deflate then encrypt when sending, and to first decrypt then inflate when receiving.
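The send and receive ordering described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: `toyDeflate`/`toyInflate` are a trivial run-length coder and `toyCrypt` is an XOR with a fixed key, not the NTF compressors or TLS. The only point is the order in which the stages are applied.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a deflater: run-length encode the input as
// (count, byte) pairs, with each run capped at 255 bytes.
std::string toyDeflate(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255) {
            ++run;
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(run)));
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

// Hypothetical stand-in for an inflater: expand (count, byte) pairs.
std::string toyInflate(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        out.append(static_cast<unsigned char>(in[i]), in[i + 1]);
    }
    return out;
}

// Hypothetical stand-in for encryption and decryption: XOR each byte with
// a fixed key, which makes the function its own inverse.
std::string toyCrypt(std::string data)
{
    for (char& c : data) {
        c = static_cast<char>(c ^ 0x5A);
    }
    return data;
}

// Outgoing path: deflate first, then encrypt.
std::string sendPath(const std::string& plaintext)
{
    return toyCrypt(toyDeflate(plaintext));
}

// Incoming path: decrypt first, then inflate.
std::string receivePath(const std::string& wire)
{
    return toyInflate(toyCrypt(wire));
}
```

Calling `receivePath(sendPath(message))` returns the original message. Deflating before encrypting matters because well-encrypted data looks random and is effectively incompressible, so the reverse order would gain little.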

As an initial proposal, this PR acknowledges that several compression techniques are popular for compressing network traffic. To start, this PR supports "zlib", "gzip", "lz4", and "zstd". These techniques are enumerated for ease of selection by the user, with a consistent API abstraction over the selected algorithm. For the implementation of these algorithms, this PR delegates to industry-standard third-party libraries to perform the actual compression and decompression. These third-party libraries must be explicitly enabled at build time. If the third-party library implementing a selected compression technique was not configured as a dependency at build time, the initialization of the compressor will fail at run time with a detectable error. Alternatively, users may "plug in" their own compressors implemented however they wish.
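As a rough sketch of the "fail at run time with a detectable error" behavior, the following shows one way a selection by enumerator can be rejected when the backing library was not compiled in. All names here (`CompressionKind`, `InitResult`, `initializeCompressor`, and the `SKETCH_WITH_*` macros) are hypothetical and are not the identifiers used by this PR; the sketch also assumes gzip is served by the zlib library.

```cpp
// Hypothetical enumeration mirroring the four techniques named above.
enum class CompressionKind { Zlib, Gzip, Lz4, Zstd };

// Hypothetical result of attempting to initialize a compressor.
enum class InitResult { Ok, NotSupported };

// Hypothetical build-time switch. A real build would derive such macros
// from its configuration step; here only zlib is treated as enabled.
#define SKETCH_WITH_ZLIB 1

// Return true if support for the specified 'kind' was compiled in.
bool isBuiltIn(CompressionKind kind)
{
    switch (kind) {
    case CompressionKind::Zlib:
    case CompressionKind::Gzip:  // assumption: gzip is provided by zlib
#if defined(SKETCH_WITH_ZLIB)
        return true;
#else
        return false;
#endif
    case CompressionKind::Lz4:
#if defined(SKETCH_WITH_LZ4)
        return true;
#else
        return false;
#endif
    case CompressionKind::Zstd:
#if defined(SKETCH_WITH_ZSTD)
        return true;
#else
        return false;
#endif
    }
    return false;
}

// Report a detectable error instead of failing at link time or crashing
// when the requested technique is unavailable in this build.
InitResult initializeCompressor(CompressionKind kind)
{
    return isBuiltIn(kind) ? InitResult::Ok : InitResult::NotSupported;
}
```

With only the zlib switch defined, `initializeCompressor(CompressionKind::Zstd)` yields `InitResult::NotSupported`, which a caller can check before attaching the compressor to a socket.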

To build with internal support for the enumerated compression algorithms, perform the build as:

$ ./configure --with-zlib --with-zstd --with-lz4
$ make
$ make install

This PR introduces the following new components:

ntca_deflateoptions: The parameters that influence the behavior of an operation to compress data.
ntca_deflatecontext: The context in which a deflate operation completes.
ntca_inflateoptions: The parameters that influence the behavior of an operation to decompress data.
ntca_inflatecontext: The context in which an inflate operation completes.
ntca_compressiontype: Enumeration of well-known compression algorithms
ntca_compressiongoal: Enumeration of the desired trade-offs of speed vs. size
ntca_checksumtype: Enumeration of checksums used by the supported compression algorithms
ntca_checksum: Union of different checksum values and streaming update algorithms.
ntci_compression: Abstraction of a mechanism to deflate and inflate data according to a compression algorithm and framing protocol
ntci_compressiondriver: Pluggable factory that produces concrete compressors for a particular algorithm and framing protocol
ntctlc_plugin: Transport-level compression; the concrete implementations of an abstract compressor implemented in terms of the third-party libraries zlib, liblz4, and libzstd (if configured at build time).

This PR integrates automatic compression of communication through a socket by adding new methods to ntci::StreamSocket and ntci::DatagramSocket called setWriteDeflater and setReadInflater. Compression may be applied in only one direction (i.e., outgoing data is compressed but incoming data is not decompressed). For example, see the usage of d_sendDeflater_sp in ntcr::StreamSocket::send() at ntcr_streamsocket.cpp:5472 and d_receiveInflater_sp in ntcr::StreamSocket::privateDequeueReceiveBuffer() at ntcr_streamsocket.cpp:2922. Note, however, that many code paths in both ntc{r,p}_streamsocket and ntc{r,p}_datagramsocket must handle possible deflation and inflation, both when there is no encryption and when encryption is simultaneously enabled.
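To make the one-direction case concrete, here is a minimal in-memory sketch. `MockSocket`, `Transform`, `toyWrap`, and `toyUnwrap` are hypothetical; only the method names setWriteDeflater and setReadInflater come from this PR, and their real parameter types are not shown in this description, so the signatures below are assumptions.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical transform type standing in for a compressor object.
typedef std::function<std::string(const std::string&)> Transform;

// Hypothetical in-memory stand-in for one end of a connection.
class MockSocket {
    Transform d_deflater;  // applied to outgoing data, if set
    Transform d_inflater;  // applied to incoming data, if set

  public:
    // Assumed signatures; the real methods take NTF compression objects.
    void setWriteDeflater(Transform t) { d_deflater = std::move(t); }
    void setReadInflater(Transform t) { d_inflater = std::move(t); }

    // Return the bytes that would be placed on the wire for 'data'.
    std::string send(const std::string& data) const
    {
        return d_deflater ? d_deflater(data) : data;
    }

    // Return the bytes the user would see for the specified 'wire' data.
    std::string receive(const std::string& wire) const
    {
        return d_inflater ? d_inflater(wire) : wire;
    }
};

// Toy "deflater" used only for illustration: wrap the data in a marker.
std::string toyWrap(const std::string& s)
{
    return "Z(" + s + ")";
}

// Toy "inflater" used only for illustration: strip the marker again.
std::string toyUnwrap(const std::string& s)
{
    if (s.size() < 3) {
        return std::string();
    }
    return s.substr(2, s.size() - 3);
}
```

If a client calls `setWriteDeflater(toyWrap)` and the server calls `setReadInflater(toyUnwrap)`, data flowing from client to server is transformed on the wire and restored for the reader, while data flowing from server to client passes through unchanged because neither side attached a transform for that direction.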

Compression support is tested in a new testing framework for the ntcf package. This testing machinery is neither compiled into the library nor publicly installed. Subsequent work will try to simplify some of the low-level tests in ntcf_system by rewriting them in terms of this higher-level testing framework. Consider ntcf_test* to be a long-term work in progress.

che2 · 2025-03-05T21:32:35Z

groups/ntc/ntcd/ntcd_compression.cpp

+const bsl::uint32_t CompressionFrameHeader::k_MAGIC = 1380730184;
+#else
+const bsl::uint32_t CompressionFrameHeader::k_MAGIC = 1212501074;


Please add a comment explaining where these magic numbers come from.

They don't come from anywhere, that's why they are "magic". Magic numbers correspond to identifiable byte sequences used to help pluck out frame boundaries in hex dumps.
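As an aside on the "identifiable byte sequences" point, the two constants in the diff are byte-swapped forms of each other, and each byte is a printable ASCII character. The small sketch below decodes them; the helper names `magicToAscii` and `byteSwap32` are illustrative and not part of the PR. Which preprocessor branch corresponds to which byte order is not visible in the excerpt.

```cpp
#include <cstdint>
#include <string>

// Render a 32-bit value as four characters, most significant byte first.
std::string magicToAscii(std::uint32_t value)
{
    std::string text;
    for (int shift = 24; shift >= 0; shift -= 8) {
        text.push_back(static_cast<char>((value >> shift) & 0xFF));
    }
    return text;
}

// Reverse the byte order of a 32-bit value.
std::uint32_t byteSwap32(std::uint32_t value)
{
    return ((value & 0x000000FFu) << 24) | ((value & 0x0000FF00u) << 8) |
           ((value & 0x00FF0000u) >> 8) | ((value & 0xFF000000u) >> 24);
}
```

Here `magicToAscii(1380730184u)` gives "RLEH" (0x524C4548) and `magicToAscii(1212501074u)` gives "HELR" (0x48454C52), so the same four bytes appear in a hex dump whichever constant matches the host byte order.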

mattrm456 added 30 commits February 7, 2025 14:03

Implement ntci::Compression and integrate it into ntci::StreamSocket …

e92f252

…and ntci::DatagramSocket

Implement compression

8fab9bb

Before refactoring ntcd_compression

ef4df7b

Experiment with new ntcf test framework

9ba429e

Experiment with new ntcf test framework

18a4b0b

Experiment with new ntcf test framework

1ce667b

Experiment with new ntcf test framework

a6b0ca3

Add ntctlc

6698d98

Test ntctlc

9182f82

Test ntctlc

4f4da18

Test ntctlc

bc28d01

Finish testing ntctlc zlib and gzip compressors

cab812f

Test ntctlc lz4

7bfe9cc

Test ntctlc compression and encryption

a8a24af

Test ntctlc compression and encryption

15008e5

Test ntctlc compression and encryption

e4984e1

Compression and encryption formatting

043bd09

Compression and encryption formatting

b0b4d41

Compression and encryption formatting

f6a644b

Refactor checksum into individual implementation types

af6415f

Cosmetics

8fdb46b

Ensure deflaters and inflaters are cleared during socket closure to p…

c1b9642

…revent cycles

Add ntci::Compression usage example

851b690

Cosmetics

52debee

Cosmetics

fc82525

Fix warnings from MSVC on Windows

b28f427

Github Actions: Add zstd and lz4 dependencies to container and config…

435930f

…ure build with zlib, zstd, and lz4 support

Avoid setting LZ4F_decompressOptions_t::skipChecksums until at least …

4ca5b16

…liblz4-1.10.0

Fix ntctlc_plugin lz4 version check

68fd789

Unity build

c4eb194

mattrm456 added 5 commits March 3, 2025 20:56

GitHub Actions: debug ntca_checksum

50121f0

GitHub Actions: debug ntca_checksum move reset

da641e1

GitHub Actions: fix all test cases

5ed3925

Implement zstd compression

3808d3f

Fix warning in optimized builds when conditionally asserting on OpenS…

cf83799

…SL handshake state

che2 reviewed Mar 5, 2025

mattrm456 added 5 commits March 6, 2025 09:44

Attempt to optimize deflation in LZ4

83a3f70

Fall back to deflating into a temporary buffer in LZ4 when output blo…

a16be7c

…b buffer has insufficient capacity

Allow default compression parameters to be set for each socket

0a829cf

Populate ntca::SendContext to inform the user how compression was app…

631e87f

…lied during send operation

Detect LZ4 and ZSTD versions for compile-time function and symbol choice

6b9e584

che2 approved these changes Mar 10, 2025

Allow multiple LZ4 blocks to spill over during deflation

319e276

mattrm456 merged commit b7aeddf into bloomberg:main Mar 11, 2025
1 check failed
