Refactor Google Cloud Storage to use blob.open #744
Swap to using the GCS-native blob.open under the hood. Reduces code maintenance overhead.
Breaking changes:
* Removed gcs.Reader/gcs.Writer classes
* No Reader/Writer.terminate()
* The buffer size can no longer be controlled independently of chunk_size
* Calling close twice on a gcs file object will now throw an exception
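For callers affected by the last breaking change listed above, a close-once guard is one possible workaround. The following is only a minimal sketch: `FakeGCSFile` and `CloseOnce` are hypothetical names that exist in neither smart_open nor google-cloud-storage, and the `ValueError` raised by the fake is an assumption standing in for whatever exception the real file object throws.

```python
class FakeGCSFile:
    """Hypothetical stand-in for a file object that raises on a second close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        if self.closed:
            # Assumed exception type; the real object may raise something else.
            raise ValueError("file already closed")
        self.closed = True


class CloseOnce:
    """Hypothetical wrapper that makes close() safe to call repeatedly."""

    def __init__(self, fileobj):
        self._fileobj = fileobj

    def __getattr__(self, name):
        # Delegate everything else (read, write, closed, ...) to the wrapped object.
        return getattr(self._fileobj, name)

    def close(self):
        # Forward only the first close() call; later calls are no-ops.
        if not self._fileobj.closed:
            self._fileobj.close()
```

Using a `with` block around the file object avoids the problem entirely where that is practical; the wrapper is only of interest for code paths that close explicitly in more than one place.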
Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>
full diff vs #729, mainly fixed new lint warnings, removed no-op test, and amended docstring: 12e22b7...ddelange:use-gcs-open-fixes
Thank you @ddelange and @cadnce. @petedannemann Can you please give this a once-over? If all looks good, we can merge.
Great job everyone, thank you for your efforts with this PR @cadnce @ddelange @petedannemann!
Thanks everyone for their work, this PR should substantially reduce the amount of maintenance required for this smart-open plugin!
def terminate(self):
    """Cancel the underlying resumable upload."""
    #
    # https://cloud.google.com/storage/docs/xml-api/resumable-upload#example_cancelling_an_upload
    #
    self._session.delete(self._resumable_upload_url)
@cadnce fyi, starting with smart_open 6.3.0 (which includes this PR), multipart uploads will no longer be cancelled when an exception occurs. I've opened an upstream issue: googleapis/python-storage#1228
update: the functionality will be back in their next major release (3.0)
@@ -37,7 +37,7 @@ def read(fname):
 aws_deps = ['boto3']
-gcs_deps = ['google-cloud-storage>=1.31.0']
+gcs_deps = ['google-cloud-storage>=2.6.0']
for future reference, this version bump relates to googleapis/python-storage#878 and the general maturing of the open interface in v2
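For downstream code that wants to verify this minimum at runtime, a rough sketch follows. `MIN_GCS_VERSION`, `parse_version`, and `gcs_version_ok` are hypothetical helpers, not part of smart_open, and the parser deliberately handles only the leading numeric `X.Y.Z` part of a version string.

```python
from importlib import metadata

# Minimum version taken from the gcs_deps pin in this PR.
MIN_GCS_VERSION = (2, 6, 0)


def parse_version(text):
    """Parse the leading numeric 'X.Y.Z' part of a version string into a tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def gcs_version_ok(installed=None):
    """Return True if google-cloud-storage meets the minimum version."""
    if installed is None:
        try:
            installed = metadata.version("google-cloud-storage")
        except metadata.PackageNotFoundError:
            return False  # package not installed at all
    return parse_version(installed) >= MIN_GCS_VERSION
```

In practice the pin in `gcs_deps` already enforces this at install time; a check like this only matters for environments where dependencies are managed outside of pip.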
Title
Refactor Google Cloud Storage to use blob.open
Motivation
Supersedes and closes #729, fixes #599
Adds one commit on top, bumps minimum gcs version to avoid breaking changes as discussed:
Tests
Tests discussed in linked tickets
Work in progress
Checklist
Before you create the PR, please make sure you have:
Workflow
Please avoid rebasing and force-pushing to the branch of the PR once a review is in progress.
Rebasing can make your commits look a bit cleaner, but it also makes life more difficult for the reviewer, because they are no longer able to distinguish between code that has already been reviewed and unreviewed code.