Propagate __exit__ call to underlying filestream #786

Merged
merged 6 commits into develop from nested_exit on Feb 21, 2024

Conversation

@ddelange (Contributor) commented on Oct 2, 2023

Title

Propagate __exit__ call to underlying filestream.

No more need for the tweak_close function: the double wrapping now causes three calls to __exit__, each of which calls close, starting with the outermost filestream and working inward.

If a third-party compression handler calls close on the inner filestream, the inner __exit__ call will make a second call to close on the inner filestream, which is ignored.
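Below is a minimal sketch of the idea, using hypothetical InnerStream and CompressionLayer classes rather than smart_open's real ones: each layer's __exit__ closes itself and then delegates __exit__ to the stream it wraps, so closing propagates from the outermost layer inward, and a repeated close on the inner filestream is a no-op. smart_open's double wrapping adds one more layer than shown here, hence the three __exit__ calls mentioned above.

    class InnerStream:
        """Stand-in for the raw filestream, e.g. an S3 multipart writer."""
        def __init__(self):
            self.closed = False

        def close(self):
            if self.closed:
                return  # a second close() is ignored
            self.closed = True

        def __enter__(self):
            return self

        def __exit__(self, *exc):
            self.close()


    class CompressionLayer:
        """Stand-in for a third-party compression handler wrapping a stream."""
        def __init__(self, inner):
            self.inner = inner

        def close(self):
            self.inner.close()  # some handlers also close the inner stream

        def __enter__(self):
            return self

        def __exit__(self, *exc):
            self.close()
            self.inner.__exit__(*exc)  # propagate __exit__ inward


    with CompressionLayer(InnerStream()) as fout:
        pass
    # Leaving the with-block calls __exit__ on the outer layer, which closes it
    # and then calls __exit__ (and hence close) on InnerStream; the repeated
    # close() on the already-closed inner stream is harmlessly ignored.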

Motivation

Currently, the compression mechanic has the side-effect that __exit__ is no longer called on the original filestream. Consequently, e.g. boto3's abort multipart upload is not called, leaving lingering blobs behind that will be billed.
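A hypothetical user-facing scenario of that failure mode (bucket and key names are made up; running it needs valid AWS credentials and an existing bucket): an exception raised inside the with-block of a compressed write. Before this change, __exit__ was not propagated to the inner S3 writer, so the multipart upload was neither completed nor aborted.

    import smart_open

    try:
        # the '.gz' suffix makes smart_open wrap the S3 writer in a compression layer
        with smart_open.open("s3://my-bucket/data.txt.gz", "w") as fout:
            fout.write("some data")
            raise RuntimeError("simulated failure mid-write")
    except RuntimeError:
        pass
    # With this PR, the exception propagates __exit__ down to the S3 writer,
    # which can abort the multipart upload instead of leaving a dangling,
    # billed partial upload behind.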

Tests

Work in progress

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Linked to any existing issues that your PR will be solving
  • Included tests for any new functionality
  • Checked that all unit tests pass

Workflow

Please avoid rebasing and force-pushing to the branch of the PR once a review is in progress.
Rebasing can make your commits look a bit cleaner, but it also makes life more difficult for the reviewer, because they are no longer able to distinguish between code that has already been reviewed and code that has not.

@ddelange force-pushed the nested_exit branch 3 times, most recently from 0586dce to f0cca5d on October 3, 2023 at 12:52
@@ -909,7 +909,8 @@ def write(self, b):

     def terminate(self):
         """Cancel the underlying multipart upload."""
-        assert self._upload_id, "no multipart upload in progress"
+        if self._upload_id is None:
+            return
@ddelange (Contributor, Author) commented:

terminate() called before a multipart upload was initiated:

self = <smart_open.tests.test_smart_open.S3OpenTest testMethod=test_write_bad_encoding_strict>

    @moto.mock_s3
    def test_write_bad_encoding_strict(self):
        """Should open the file for writing with the correct encoding."""
        s3 = _resource('s3')
        s3.create_bucket(Bucket='bucket')
        key = "s3://bucket/key.txt"
        text = u'欲しい気持ちが成長しすぎて'
    
        with self.assertRaises(UnicodeEncodeError):
            with smart_open.open(key, 'w', encoding='koi8-r', errors='strict') as fout:
>               fout.write(text)
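
A condensed, hypothetical sketch of why the diff above replaces the assertion: with __exit__ now always propagating, terminate() can be reached before any multipart upload has been started (as in the failing test above), so it has to return quietly instead of asserting.

    class _WriterSketch:
        def __init__(self):
            self._upload_id = None  # only set once a multipart upload starts

        def terminate(self):
            """Cancel the underlying multipart upload."""
            if self._upload_id is None:
                return  # nothing uploaded yet, nothing to abort
            # ... the real writer would abort the multipart upload here ...

    _WriterSketch().terminate()  # safe even when no upload is in progress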

@mpenkov (Collaborator) left a comment:

LGTM, thank you

@mpenkov mpenkov merged commit 192845b into piskvorky:develop Feb 21, 2024
21 checks passed
ddelange added a commit to ddelange/smart_open that referenced this pull request Feb 21, 2024
…open into patch-2

* 'develop' of https://github.com/RaRe-Technologies/smart_open:
  Propagate __exit__ call to underlying filestream (piskvorky#786)
  Retry finalizing multipart s3 upload (piskvorky#785)
  Fix `KeyError: 'ContentRange'` when received full content from S3 (piskvorky#789)
  Add support for SSH connection via aliases from `~/.ssh/config` (piskvorky#790)
  Make calls to smart_open.open() for GCS 1000x faster by avoiding unnecessary GCS API call (piskvorky#788)
  Add zstandard compression feature (piskvorky#801)
  Support moto 4 & 5 (piskvorky#802)
  Secure the connection using SSL when connecting to the FTPS server (piskvorky#793)
  upgrade dev status classifier to stable (piskvorky#798)
  Fix formatting of python code (piskvorky#795)
Development

Successfully merging this pull request may close these issues.

S3 Multipart uploads committed on exception when using compression via context manager
2 participants