-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-95534: Improve gzip reading speed by 10% #97664
Conversation
Once all the reviewing is done, I will squash the commits. |
Sorry I'm not a gzip expert so I can't review this. However I'd just like to say that you don't need to squash the commits, the core dev will squash them for you when merging! Thanks for the thorough research and a solution to speeding up gzip! |
Thank you. It is good to know that "fix lock stuff" and "reorder code" will not get added to the list of commit messages.
Can you help me with the adress sanitizer? It is not happy about the way I added a new heap type, even though I seem to have done exactly the same thing as for the other heap types in the module. I am at a loss here. It seems to crash in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of my comments are fairly minor, overall I think this code is well written.
Modules/zlibmodule.c
Outdated
"\n" | ||
" wbits = 15\n" | ||
" zdict\n" | ||
" The predefined compression dictionary. This must be the same\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Describe what type this must be (usually bytes?) so that nobody is confused thinking it is a Python dict.
Consider using O!
in your format below to have the API do a type check for you rather than failing later on when its use is attempted via the buffer API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with O!
is that it cannot correctly accept bytes
and bytearray
simultaneously. Or is there a better solution for this?
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
As stated, most of the code is copied from the Cpython codebase is elsewhere, so I do not want to take credits for it. Thank you for the review. I have updated the code according to your comments. EDIT: I did notice a significant oversight when revisiting: I have not added any testing code for _ZlibDecompressor. This can be very easily be arranged (Python-isal already has a full test suite). On the other hand _ZlibDecompressor's only use case is usage in the gzip module. So it can also be considered sufficiently tested using the current test suite. EDIT: Added the test suite (adapted from test_bz2.py). |
As a heads up, I see this has the "awaiting changes" label. The changes have already been made 12 days ago. |
@rhpvorderman you can re-trigger the bot. See the comment here #97664 (comment). |
Whoop sorry, 🤦♂️ . Should have read that better. Usually I contribute to much smaller projects where such things are commonly not used. I have made the requested changes; please review again |
Thanks for making the requested changes! @gpshead: please review the changes made to this pull request. |
* main: (31 commits) pythongh-95913: Move subinterpreter exper removal to 3.11 WhatsNew (pythonGH-98345) pythongh-95914: Add What's New item describing PEP 670 changes (python#98315) Remove unused arrange_output_buffer function from zlibmodule.c. (pythonGH-98358) pythongh-98174: Handle EPROTOTYPE under macOS in test_sendfile_fallback_close_peer_in_the_middle_of_receiving (python#98316) pythonGH-98327: Reduce scope of catch_warnings() in _make_subprocess_transport (python#98333) pythongh-93691: Compiler's code-gen passes location around instead of holding it on the global compiler state (pythonGH-98001) pythongh-97669: Create Tools/build/ directory (python#97963) pythongh-95534: Improve gzip reading speed by 10% (python#97664) pythongh-95913: Forward-port int/str security change to 3.11 What's New in main (python#98344) pythonGH-91415: Mention alphabetical sort ordering in the Sorting HOWTO (pythonGH-98336) pythongh-97930: Merge with importlib_resources 5.9 (pythonGH-97929) pythongh-85525: Remove extra row in doc (python#98337) pythongh-85299: Add note warning about entry point guard for asyncio example (python#93457) pythongh-97527: IDLE - fix buggy macosx patch (python#98313) pythongh-98307: Add docstring and documentation for SysLogHandler.createSocket (pythonGH-98319) pythongh-94808: Cover `PyFunction_GetCode`, `PyFunction_GetGlobals`, `PyFunction_GetModule` (python#98158) pythonGH-94597: Deprecate child watcher getters and setters (python#98215) pythongh-98254: Include stdlib module names in error messages for NameErrors (python#98255) Improve speed. Reduce auxiliary memory to 16.6% of the main array. (pythonGH-98294) [doc] Update logging cookbook with an example of custom handling of levels. (pythonGH-98290) ...
…thon. The gzip._GzipReader we're inheriting from had some changes to stay up to date with changes in zlib library and to perform some optimization. Specifically: - GzipFile.read has been optimized. There is no longer a unconsumed_tail member to write back to padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer. - _add_read_data has been inlined, as it was just two calls. We've adapted our own code to reflect these changes. More info: python/cpython#97664
…thon. The gzip._GzipReader we're inheriting from had some changes to stay up to date with changes in zlib library and to perform some optimization. Specifically: - GzipFile.read has been optimized. There is no longer a unconsumed_tail member to write back to padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer. - _add_read_data has been inlined, as it was just two calls. We've adapted our own code to reflect these changes. More info: python/cpython#97664
For motivation check the github issue. #95534
Performance figures before (the used gzip file is the tar source distribution from github):
after:
Performance tests where run with all optimizations enabled. I found that
--enable-optimizations
did not influence the result. So it can be verified without all the PGO stuff.Change summary:
gzip.READ_BUFFER_SIZE
constant that is 128KB. Other programs that read in 128KB chunks: pigz and cat. So this seems best practice among good programs. Also it is faster than 8 kb chunks.unconsumed_tail
member to write back to padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer._add_read_data
has been inlined, as it was just two calls.EDIT: While I am adding improvements anyway, I figured I could add another one-liner optimization now to the python -m gzip application. That read chunks in io.DEFAULT_BUFFER_SIZE previously, but has been updated now to use READ_BUFFER_SIZE chunks.
Results:
before:
After:
For comparison: pigz, the fastest zlib utilizing gzip decompressor, on a single thread. (igzip is faster, but utilizes ISA-L).
If we take the pure C pigz program as baseline, the amount of python overhead is reduced drastically from 30% to 10%.