Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new top-level compression parameter #609

Merged
merged 20 commits into from
May 5, 2021

Conversation

dmcguire81
Copy link
Contributor

@dmcguire81 dmcguire81 commented Apr 22, 2021

Title

Feature to add compression parameter to override inference by file extension

Motivation

A previous PR was closed for being incompatible with the long-term API design for smart_open.open. Because the long-term goal, discussed in the linked issue is to add an enum-like compression parameter, and deprecate the ignore_ext parameter, which would be a breaking change, this is a backwards-compatible compromise that provides a clean upgrade path. Specifically, it preserves the existing ignore_ext parameter, instead adding a PendingDeprecationWarning, and makes its use mutually-exclusive with the new compression parameter, verified by comprehensive unit testing (the truth tables used to build the test suites are provided in comments for any discussion of refinement needed). On the next major version, the ignore_ext parameter can be removed and the default for compression changed to INFER_FROM_EXTENSION, since it won't be necessary to apply a MUTEX on the parameters, at that point (i.e.: lines 177-182 can just be removed).

Fixes #607.

Tests

Added test_smart_open.HandleS3CompressionTestCase. This is specific to S3 (where the mocking is the best), but could be easily extended to any other backend. However, it looked as though S3 is the go-to backend for verifying shared behavior.

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Linked to any existing issues that your PR will be solving
  • Included tests for any new functionality
  • Checked that all unit tests pass

@dmcguire81
Copy link
Contributor Author

@piskvorky please review based on design discussion

@piskvorky piskvorky requested a review from mpenkov April 27, 2021 18:14
Copy link
Collaborator

@mpenkov mpenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this contribution. The changes to the library itself mostly look good. The unit tests need a bit of work. Please have a look at my comments and let me know when you're ready for another round of reviews.

smart_open/smart_open_lib.py Outdated Show resolved Hide resolved
smart_open/smart_open_lib.py Outdated Show resolved Hide resolved
smart_open/smart_open_lib.py Outdated Show resolved Hide resolved
smart_open/tests/test_smart_open.py Outdated Show resolved Hide resolved
smart_open/tests/test_smart_open.py Outdated Show resolved Hide resolved
smart_open/tests/test_smart_open.py Show resolved Hide resolved
smart_open/tests/test_smart_open.py Show resolved Hide resolved
smart_open/compression.py Show resolved Hide resolved
smart_open/compression.py Show resolved Hide resolved
smart_open/compression.py Show resolved Hide resolved
@dmcguire81
Copy link
Contributor Author

@mpenkov ready for another round of reviews

@dmcguire81
Copy link
Contributor Author

@mpenkov @piskvorky ping

@dmcguire81 dmcguire81 requested a review from mpenkov May 3, 2021 20:52
@dmcguire81 dmcguire81 requested a review from mpenkov May 4, 2021 18:11
@mpenkov
Copy link
Collaborator

mpenkov commented May 5, 2021

I'll take it from here. Thank you for your contribution.

@mpenkov mpenkov changed the title Feature to add compression parameter to override inference by file extension Add new top-level compression parameter May 5, 2021
@mpenkov mpenkov merged commit 7716c7d into piskvorky:develop May 5, 2021
mpenkov added a commit that referenced this pull request May 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a feature to override file extension defining compression
2 participants