Support multiple algorithms; support quality factors #7

Merged
merged 4 commits into from
Oct 5, 2020

15 changes: 12 additions & 3 deletions README.md
@@ -12,9 +12,18 @@ The preferred solution is to have a server (like [Nginx](http://wiki.nginx.org/M

 ## How it works

-Flask-Compress both adds the various headers required for a compressed response and gzips the response data. This makes serving gzip compressed static files extremely easy.
+Flask-Compress both adds the various headers required for a compressed response and compresses the response data.
+This makes serving compressed static files extremely easy.

-Internally, every time a request is made the extension will check if it matches one of the compressible MIME types and will automatically attach the appropriate headers.
+Internally, every time a request is made the extension will check if it matches one of the compressible MIME types
+and whether the client and the server use some common compression algorithm, and will automatically attach the
+appropriate headers.
+
+To determine the compression algorithm, the `Accept-Encoding` request header is inspected, respecting the
+quality factor as described in [MDN docs](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding).
+If no requested compression algorithm is supported by the server, we don't compress the response. If, on the other
+hand, multiple suitable algorithms are found and are requested with the same quality factor, we choose the first one
+defined in the `COMPRESS_ALGORITHM` option (see below).
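
The selection rule described in this README paragraph can be sketched in a few lines. This is only an illustration, not the code from this PR: `choose_algorithm` and its `preferences` argument (a mapping of algorithm name to quality factor, assumed to be already parsed from `Accept-Encoding`) are hypothetical names, and wildcard handling is left out.

```python
def choose_algorithm(preferences, enabled_algorithms):
    """Pick an algorithm both sides support, or None if there is none.

    preferences: hypothetical dict of algorithm name -> quality factor.
    enabled_algorithms: server-side list, in order of server preference.
    """
    # Keep server order so that ties can be resolved by position.
    candidates = [algo for algo in enabled_algorithms if algo in preferences]
    if not candidates:
        return None
    # max() returns the first maximal element, so equal quality factors
    # resolve to the algorithm listed first in the server configuration.
    return max(candidates, key=lambda algo: preferences[algo])


best = choose_algorithm({'zstd': 0.8, 'br': 0.9, 'gzip': 0.5}, ['zstd', 'br', 'gzip'])
tied = choose_algorithm({'zstd': 0.5, 'br': 0.5, 'gzip': 0.5}, ['gzip', 'br', 'zstd'])
none = choose_algorithm({'some-alien-algorithm': 1.0}, ['br', 'gzip'])
```

Here `best` is `'br'` (highest quality factor), `tied` is `'gzip'` (first in the server list), and `none` is `None`, matching the three cases the paragraph describes.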


 ## Installation
@@ -79,4 +88,4 @@ Within your Flask application's settings you can provide the following settings
 | `COMPRESS_CACHE_KEY` | Specifies the cache key method for lookup/storage of response data. | `None` |
 | `COMPRESS_CACHE_BACKEND` | Specified the backend for storing the cached response data. | `None` |
 | `COMPRESS_REGISTER` | Specifies if compression should be automatically registered. | `True` |
-| `COMPRESS_ALGORITHM` | Compression algorithm used: `gzip` or `br`. | `gzip` |
+| `COMPRESS_ALGORITHM` | Supported compression algorithms, either comma-separated (`'gzip, br'`) or a list (`['br', 'gzip']`) | `gzip` |
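
To make the accepted forms of `COMPRESS_ALGORITHM` concrete, here is a small sketch of how the three shapes (single name, comma-separated string, list) could be reduced to one ordered list. `normalize_algorithms` is a hypothetical standalone helper written for this comment; it only mirrors what the `init_app` change in this PR appears to do.

```python
def normalize_algorithms(value):
    """Turn a COMPRESS_ALGORITHM setting into an ordered list of names."""
    if isinstance(value, str):
        # 'gzip, br' -> ['gzip', 'br']; a single name yields a one-item list.
        return [name.strip() for name in value.split(',')]
    # Already a sequence: copy it so the caller's object is not shared.
    return list(value)


single = normalize_algorithms('gzip')             # backwards-compatible form
from_string = normalize_algorithms('gzip, br')    # comma-separated form
from_list = normalize_algorithms(['br', 'gzip'])  # list form
```

The order of the resulting list matters, because it is what breaks ties between algorithms requested with the same quality factor.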
86 changes: 78 additions & 8 deletions flask_compress.py
@@ -7,6 +7,8 @@
 from gzip import GzipFile
 from io import BytesIO

+from collections import defaultdict
+
 import brotli
 from flask import request, current_app

@@ -78,17 +80,85 @@ def init_app(self, app):
         self.cache = backend() if backend else None
         self.cache_key = app.config['COMPRESS_CACHE_KEY']

+        algo = app.config['COMPRESS_ALGORITHM']
+        if isinstance(algo, str):
+            self.enabled_algorithms = [i.strip() for i in algo.split(',')]
+        else:
+            self.enabled_algorithms = algo
+
         if (app.config['COMPRESS_REGISTER'] and
                 app.config['COMPRESS_MIMETYPES']):
             app.after_request(self.after_request)

+    def _choose_compress_algorithm(self, accept_encoding_header):
+        """
+        Determine which compression algorithm we're going to use based on the
+        client request. The `Accept-Encoding` header may list one or more desired
+        algorithms, together with a "quality factor" for each one (higher quality
+        means the client prefers that algorithm more).
+
+        :param accept_encoding_header: Content of the `Accept-Encoding` header
+        :return: Name of a compression algorithm (e.g. `gzip` or `br`) or `None` if
+                 the client and server don't agree on any.
+        """
+        # Map quality factors to requested algorithm names.
+        algos_by_quality = defaultdict(set)
+
+        # A flag denoting that client requested using any (`*`) algorithm,
+        # in case a specific one is not supported by the server
+        fallback_to_any = False
+
+        for part in accept_encoding_header.lower().split(','):
+            part = part.strip()
+            quality = 1.0
+
+            if ';q=' in part:
+                # If the client associated a quality factor with an algorithm,
+                # try to parse it. We could do the matching using a regex, but
+                # the format is so simple that it would be overkill.
+                algo = part.split(';')[0].strip()
+                try:
+                    quality = float(part.split('=')[1].strip())
+                except ValueError:
+                    pass
+            else:
+                # Otherwise, use the default quality
+                algo = part
+
+            algos_by_quality[quality].add(algo)
+            fallback_to_any = fallback_to_any or (algo == '*')
+
+        # Choose the algorithm with the highest quality factor that the server supports.
+        #
+        # If there are multiple equally good options, choose the first supported algorithm
+        # from server configuration.
+        #
+        # If the server doesn't support any algorithm that the client requested but
+        # there's a special wildcard algorithm request (`*`), choose the first supported
+        # algorithm.
+        server_algo_set = set(self.enabled_algorithms)
+        for _, requested_algo_set in sorted(algos_by_quality.items(), reverse=True):
+            viable_algos = server_algo_set & requested_algo_set
+            if len(viable_algos) == 1:
+                return viable_algos.pop()
+            elif len(viable_algos) > 1:
+                for server_algo in self.enabled_algorithms:
+                    if server_algo in viable_algos:
+                        return server_algo
+        else:
+            if fallback_to_any:
+                return self.enabled_algorithms[0]
+
+        return None
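
One observation on the parsing step: the `';q=' in part` check only matches when no whitespace separates `;` from `q=`. As far as I can tell, HTTP permits optional whitespace there, so a header such as `gzip; q=0.5` would end up with the whole string `gzip; q=0.5` as the algorithm name. A minimal sketch of a more tolerant parser follows; `parse_accept_encoding` is a hypothetical helper, not part of this PR, and it returns a plain name-to-quality mapping rather than the `defaultdict` used above.

```python
def parse_accept_encoding(header):
    """Return a dict mapping each listed coding to its quality factor."""
    preferences = {}
    for element in header.lower().split(','):
        name, _, params = element.partition(';')
        name = name.strip()
        if not name:
            continue  # skip empty elements, e.g. an empty header value
        quality = 1.0  # default when no q-value is present
        key, _, value = params.partition('=')
        if key.strip() == 'q':
            try:
                quality = float(value.strip())
            except ValueError:
                pass  # malformed q-value: keep the default
        preferences[name] = quality
    return preferences


spaced = parse_accept_encoding('gzip; q=0.5, br')
compact = parse_accept_encoding('zstd;q=0.8, br;q=0.9')
```

With this sketch `spaced` is `{'gzip': 0.5, 'br': 1.0}` and `compact` is `{'zstd': 0.8, 'br': 0.9}`, so both spellings of the quality parameter are treated the same way.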

     def after_request(self, response):
         app = self.app or current_app
         accept_encoding = request.headers.get('Accept-Encoding', '')

+        chosen_algorithm = self._choose_compress_algorithm(accept_encoding)
+
         if (response.mimetype not in app.config['COMPRESS_MIMETYPES'] or
-            ('gzip' not in accept_encoding.lower() and app.config['COMPRESS_ALGORITHM'] == 'gzip') or
-            ('br' not in accept_encoding.lower() and app.config['COMPRESS_ALGORITHM'] == 'br') or
+            chosen_algorithm is None or
             not 200 <= response.status_code < 300 or
             (response.content_length is not None and
              response.content_length < app.config['COMPRESS_MIN_SIZE']) or
@@ -101,14 +171,14 @@ def after_request(self, response):
             key = self.cache_key(request)
             compressed_content = self.cache.get(key)
             if compressed_content is None:
-                compressed_content = self.compress(app, response)
+                compressed_content = self.compress(app, response, chosen_algorithm)
             self.cache.set(key, compressed_content)
         else:
-            compressed_content = self.compress(app, response)
+            compressed_content = self.compress(app, response, chosen_algorithm)

         response.set_data(compressed_content)

-        response.headers['Content-Encoding'] = app.config['COMPRESS_ALGORITHM']
+        response.headers['Content-Encoding'] = chosen_algorithm
         response.headers['Content-Length'] = response.content_length

         vary = response.headers.get('Vary')
@@ -120,13 +190,13 @@ def after_request(self, response):

         return response

-    def compress(self, app, response):
-        if app.config['COMPRESS_ALGORITHM'] == 'gzip':
+    def compress(self, app, response, algorithm):
+        if algorithm == 'gzip':
             gzip_buffer = BytesIO()
             with GzipFile(mode='wb',
                           compresslevel=app.config['COMPRESS_LEVEL'],
                           fileobj=gzip_buffer) as gzip_file:
                 gzip_file.write(response.get_data())
             return gzip_buffer.getvalue()
-        elif app.config['COMPRESS_ALGORITHM'] == 'br':
+        elif algorithm == 'br':
             return brotli.compress(response.get_data())
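
For readers who want to try the `gzip` branch in isolation, the same in-memory pattern can be exercised with the standard library alone. This is a sketch, not the extension's code: `gzip_bytes` is a hypothetical helper, and the default level of 6 is an assumption standing in for `COMPRESS_LEVEL`. The `br` branch is not shown because it needs the third-party `brotli` package.

```python
import gzip
from gzip import GzipFile
from io import BytesIO


def gzip_bytes(data, level=6):
    """Compress `data` in memory using a GzipFile wrapped around a BytesIO."""
    buffer = BytesIO()
    with GzipFile(mode='wb', compresslevel=level, fileobj=buffer) as gz:
        gz.write(data)
    # The gzip trailer is written when the `with` block closes the file,
    # so the buffer is read only after that point.
    return buffer.getvalue()


payload = b'<p>hello</p>' * 200
compressed = gzip_bytes(payload)
restored = gzip.decompress(compressed)
```

Reading the buffer after the `with` block matters: reading it earlier would return a stream without its trailer, which clients could not decode.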
126 changes: 126 additions & 0 deletions tests/test_flask_compress.py
@@ -116,5 +116,131 @@ def test_content_length_options(self):
         response = client.options('/small/', headers=headers)
         self.assertEqual(response.status_code, 200)

+
+class CompressionAlgoTests(unittest.TestCase):
+    """
+    Test different scenarios for compression algorithm negotiation between
+    client and server. Please note that algorithm names (even the "supported"
+    ones) in these tests **do not** indicate that all of these are actually
+    supported by this extension.
+    """
+    def setUp(self):
+        super(CompressionAlgoTests, self).setUp()
+
+        # Create the app here but don't call `Compress()` on it just yet; we need
+        # to be able to modify the settings in various tests. Calling `Compress(self.app)`
+        # twice would result in two `@after_request` handlers, which would be bad.
+        self.app = Flask(__name__)
+        self.app.testing = True
+
+        small_path = os.path.join(os.getcwd(), 'tests', 'templates', 'small.html')
+        self.small_size = os.path.getsize(small_path) - 1
+
+        @self.app.route('/small/')
+        def small():
+            return render_template('small.html')
+
+    def test_setting_compress_algorithm_simple_string(self):
+        """ Test that a single entry in `COMPRESS_ALGORITHM` still works for backwards compatibility """
+        self.app.config['COMPRESS_ALGORITHM'] = 'gzip'
+        c = Compress(self.app)
+        self.assertListEqual(c.enabled_algorithms, ['gzip'])
+
+    def test_setting_compress_algorithm_cs_string(self):
+        """ Test that `COMPRESS_ALGORITHM` can be a comma-separated string """
+        self.app.config['COMPRESS_ALGORITHM'] = 'gzip, br, zstd'
+        c = Compress(self.app)
+        self.assertListEqual(c.enabled_algorithms, ['gzip', 'br', 'zstd'])
+
+    def test_setting_compress_algorithm_list(self):
+        """ Test that `COMPRESS_ALGORITHM` can be a list of strings """
+        self.app.config['COMPRESS_ALGORITHM'] = ['gzip', 'br', 'deflate']
+        c = Compress(self.app)
+        self.assertListEqual(c.enabled_algorithms, ['gzip', 'br', 'deflate'])
+
+    def test_one_algo_supported(self):
+        """ Tests requesting a single supported compression algorithm """
+        accept_encoding = 'gzip'
+        self.app.config['COMPRESS_ALGORITHM'] = ['br', 'gzip']
+        c = Compress(self.app)
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'gzip')
+
+    def test_one_algo_unsupported(self):
+        """ Tests requesting single unsupported compression algorithm """
+        accept_encoding = 'some-alien-algorithm'
+        self.app.config['COMPRESS_ALGORITHM'] = ['br', 'gzip']
+        c = Compress(self.app)
+        self.assertIsNone(c._choose_compress_algorithm(accept_encoding))
+
+    def test_multiple_algos_supported(self):
+        """ Tests requesting multiple supported compression algorithms """
+        accept_encoding = 'br, gzip, zstd'
+        self.app.config['COMPRESS_ALGORITHM'] = ['zstd', 'br', 'gzip']
+        c = Compress(self.app)
+        # When the decision is tied, we expect to see the first server-configured algorithm
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'zstd')
+
+    def test_multiple_algos_unsupported(self):
+        """ Tests requesting multiple unsupported compression algorithms """
+        accept_encoding = 'future-algo, alien-algo, forbidden-algo'
+        self.app.config['COMPRESS_ALGORITHM'] = ['zstd', 'br', 'gzip']
+        c = Compress(self.app)
+        self.assertIsNone(c._choose_compress_algorithm(accept_encoding))
+
+    def test_multiple_algos_with_wildcard(self):
+        """ Tests requesting multiple unsupported compression algorithms and a wildcard """
+        accept_encoding = 'future-algo, alien-algo, forbidden-algo, *'
+        self.app.config['COMPRESS_ALGORITHM'] = ['zstd', 'br', 'gzip']
+        c = Compress(self.app)
+        # We expect to see the first server-configured algorithm
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'zstd')
+
+    def test_multiple_algos_with_different_quality(self):
+        """ Tests requesting multiple supported compression algorithms with different q-factors """
+        accept_encoding = 'zstd;q=0.8, br;q=0.9, gzip;q=0.5'
+        self.app.config['COMPRESS_ALGORITHM'] = ['zstd', 'br', 'gzip']
+        c = Compress(self.app)
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'br')
+
+    def test_multiple_algos_with_equal_quality(self):
+        """ Tests requesting multiple supported compression algorithms with equal q-factors """
+        accept_encoding = 'zstd;q=0.5, br;q=0.5, gzip;q=0.5'
+        self.app.config['COMPRESS_ALGORITHM'] = ['gzip', 'br', 'zstd']
+        c = Compress(self.app)
+        # We expect to see the first server-configured algorithm
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'gzip')
+
+    def test_default_quality_is_1(self):
+        """ Tests that when making mixed-quality requests, the default q-factor is 1.0 """
+        accept_encoding = 'deflate, br;q=0.999, gzip;q=0.5'
+        self.app.config['COMPRESS_ALGORITHM'] = ['gzip', 'br', 'deflate']
+        c = Compress(self.app)
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'deflate')
+
+    def test_default_wildcard_quality_is_0(self):
+        """ Tests that a wildcard has a default q-factor of 0.0 """
+        accept_encoding = 'br;q=0.001, *'
+        self.app.config['COMPRESS_ALGORITHM'] = ['gzip', 'br', 'deflate']
+        c = Compress(self.app)
+        self.assertEqual(c._choose_compress_algorithm(accept_encoding), 'br')
+
+    def test_content_encoding_is_correct(self):
+        """ Test that the `Content-Encoding` header matches the compression algorithm """
+        self.app.config['COMPRESS_ALGORITHM'] = ['br', 'gzip']
+        Compress(self.app)
+
+        headers_gzip = [('Accept-Encoding', 'gzip')]
+        client = self.app.test_client()
+        response_gzip = client.options('/small/', headers=headers_gzip)
+        self.assertIn('Content-Encoding', response_gzip.headers)
+        self.assertEqual(response_gzip.headers.get('Content-Encoding'), 'gzip')
+
+        headers_br = [('Accept-Encoding', 'br')]
+        client = self.app.test_client()
+        response_br = client.options('/small/', headers=headers_br)
+        self.assertIn('Content-Encoding', response_br.headers)
+        self.assertEqual(response_br.headers.get('Content-Encoding'), 'br')
+
+
 if __name__ == '__main__':
     unittest.main()