
Content compression & decompression #636

Open
krizhanovsky opened this issue Nov 7, 2016 · 0 comments

Comments

@krizhanovsky
Contributor

krizhanovsky commented Nov 7, 2016

Depends on #77 (Kernel-User Space Transport).

Content compression and decompression must be implemented. The logic is controlled by the options below. Using SIMD instruction sets where applicable is highly desirable; see the performance benchmarks of zlib vectorization optimizations.

Since there are many available HTTP compression algorithms, the algorithms must be pluggable. Most probably we should simply offload the compression tasks to third-party libraries.
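As a rough illustration of such pluggability (the names and structure are hypothetical, not Tempesta FW's actual interfaces), a registry could map each content-coding token to a compress/decompress pair supplied by a third-party library:

```python
import zlib

# Hypothetical codec registry: each HTTP content-coding token maps to a
# (compress, decompress) pair supplied by a third-party library.
CODECS = {}

def register_codec(name, compress, decompress):
    """Register a pluggable compression backend for a content coding."""
    CODECS[name] = (compress, decompress)

def _gzip_compress(data, level=6):
    # wbits=31 selects the gzip wrapper around raw deflate.
    co = zlib.compressobj(level, zlib.DEFLATED, 31)
    return co.compress(data) + co.flush()

def _gzip_decompress(data):
    # wbits=47 auto-detects the zlib or gzip wrapper.
    return zlib.decompress(data, 47)

register_codec("gzip", _gzip_compress, _gzip_decompress)
register_codec("deflate", lambda data, level=6: zlib.compress(data, level),
               zlib.decompress)
```

A Brotli backend would register its own pair the same way, keeping the core logic agnostic to the concrete algorithm.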

Consider at least Brotli and Zopfli as more efficient, but slower, algorithms. The gzip option should control the gzip module, brotli the Brotli compression algorithm, and so on:

    gzip [0-9]

The option specifies the compression level of transferred responses. The default value 0 means no compression at all; other values define the compression level. If compression is enabled, then responses must be stored in the cache in compressed form.
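The intended semantics of the level could be sketched like this (a minimal illustration using Python's zlib; encode_response is a made-up helper, not a real Tempesta FW function):

```python
import zlib

def encode_response(body, gzip_level):
    """Apply the gzip [0-9] option: 0 disables compression entirely,
    1-9 select the compression level; returns (body, content_encoding)."""
    if gzip_level == 0:
        return body, None  # served (and cached) as-is
    co = zlib.compressobj(gzip_level, zlib.DEFLATED, 31)  # 31 = gzip wrapper
    return co.compress(body) + co.flush(), "gzip"
```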

    gzip_input [0-9]

Specifies whether to decompress request bodies if a Content-Encoding: gzip header is present. See the similar logic in Apache HTTPD.

    gunzip [1,0]

Decompress received responses if they're compressed.
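A sketch of that behaviour, assuming a simple dict of response headers (maybe_gunzip is a hypothetical name, not an existing function):

```python
import zlib

def maybe_gunzip(headers, body):
    """Sketch of gunzip 1: decompress an upstream response whose
    Content-Encoding is gzip; wbits=47 auto-detects the gzip/zlib wrapper."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        return zlib.decompress(body, 47)
    return body
```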

Compression and decompression must be performed in user space using the Kernel-User Space Transport. Data compression is slow and not mission-critical logic, so it seems a poor candidate for kernel space. However, there is a probable scenario in which Tempesta FW is used as a compression-offloading proxy, without caching. In this scenario, HTTP messages to be compressed are mapped to user space for compression, and the softirq context is switched via GFSM to process other HTTP requests, HTTP error codes, and so on.

Web-server mode is assumed to use a loading script which can load two versions of a resource, compressed and plain.

RFC 7231 5.3.4 says:

    2.  If the representation has no content-coding, then it is
        acceptable by default unless specifically excluded by the
        Accept-Encoding field stating either "identity;q=0" or "*;q=0"
        without a more specific entry for "identity".

Really, if a browser sends

    Accept-Encoding: gzip, deflate

then Apache HTTPD may still send Content-Type: text/html; charset=UTF-8, i.e. a plain-text representation without compression. Thus, if the user-space compression/decompression threads are overloaded, then in most cases (unless a client explicitly prohibits plain text with identity;q=0) we should send uncompressed content.
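The RFC 7231 5.3.4 rule above can be checked with a small parser of the Accept-Encoding header (an illustrative sketch, not production-grade header parsing):

```python
def identity_acceptable(accept_encoding):
    """Per RFC 7231 5.3.4: the identity (uncompressed) coding is acceptable
    unless the client lists "identity;q=0", or "*;q=0" without a more
    specific entry for "identity"."""
    entries = {}
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.strip().split(";")]
        coding = parts[0].lower()
        q = 1.0
        for p in parts[1:]:
            if p.lower().startswith("q="):
                try:
                    q = float(p[2:])
                except ValueError:
                    pass  # malformed qvalue: keep the default of 1.0
        entries[coding] = q
    if entries.get("identity") == 0.0:
        return False
    if "identity" not in entries and entries.get("*") == 0.0:
        return False
    return True
```

Under this check, an overloaded proxy may fall back to the plain representation for a client sending `Accept-Encoding: gzip, deflate`, but not for one sending `identity;q=0`.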

Responses must be compressed/decompressed with full-skb granularity (i.e. an skb with all page fragments filled). Thus compressing threads return results when a full skb is ready or when the HTTP processing code passes the last HTTP response chunk (i.e. when the response is fully read).

    gzip_type <MIME type>

Defines which content types must be compressed. Default value is text/html.

    gzip_length <min> <max>

Defines the range of response lengths which must be compressed/decompressed. Default values are 128 and 1400. The maximum size only makes sense with gzip_threads 0.
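Together, gzip_type and gzip_length amount to a compressibility predicate along these lines (defaults taken from the option descriptions above; the function name is hypothetical):

```python
def should_compress(content_type, length,
                    gzip_type=("text/html",), gzip_min=128, gzip_max=1400):
    """Hypothetical filter combining the gzip_type and gzip_length options:
    compress only configured MIME types within the configured length range."""
    mime = content_type.split(";")[0].strip().lower()
    return mime in gzip_type and gzip_min <= length <= gzip_max
```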

Data compression is a CPU-intensive task which can lead to DoS, so the following considerations must be taken into account (see the referenced paper for details):

  • TfwClient must be accounted and limited by Frang in how many bytes the client has had compressed and decompressed (at either request or response processing time);

  • There should be one more configuration option, gzip_buffer, specifying the maximum size of decompressed data (see 4.1.1 in the paper). Decompression must be performed in chunks using a buffer of the specified size.

  • Compression and decompression must be done as late as possible, after all verification tasks.
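The gzip_buffer idea from the second bullet can be sketched as bounded, chunked inflation, so that a small compressed body (a "zip bomb") cannot expand without limit (an illustrative sketch using Python's zlib; the names are hypothetical):

```python
import zlib

class DecompressLimitError(Exception):
    """Raised when decompressed output exceeds the configured gzip_buffer."""

def bounded_gunzip(data, gzip_buffer):
    """Inflate in fixed-size chunks, aborting once the total output exceeds
    gzip_buffer bytes; wbits=47 auto-detects the gzip/zlib wrapper."""
    d = zlib.decompressobj(47)
    out = bytearray()
    tail = data
    while tail:
        # max_length caps each step; unconsumed input stays in unconsumed_tail.
        out += d.decompress(tail, 4096)
        if len(out) > gzip_buffer:
            raise DecompressLimitError("decompressed size exceeds gzip_buffer")
        tail = d.unconsumed_tail
    return bytes(out)
```

The key point is that the attacker-controlled ratio between compressed input and decompressed output is checked incrementally, never after a single unbounded inflate call.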

References

@krizhanovsky krizhanovsky added this to the 0.5.0 Web Server milestone Nov 7, 2016
@krizhanovsky krizhanovsky assigned keshonok and unassigned keshonok Jan 4, 2017
@krizhanovsky krizhanovsky modified the milestones: 0.6 WebOS, 0.5.0 Web Server Feb 13, 2017
@krizhanovsky krizhanovsky modified the milestones: backlog, 0.11 Tempesta Language Jan 15, 2018
@krizhanovsky krizhanovsky modified the milestones: 1.xx TBD, backlog Apr 19, 2023