-
-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make close method idempotent + add dummy flush method for S3
#212
Conversation
updated gen_schema function
2 modifications to correct schema parse issue
We need this method in order for S3 to mimic local files. According to the docs, this method must: > Flush the write buffers of the stream if applicable. This does nothing > for read-only and non-blocking streams. Since we're doing a multipart upload to S3, it's impossible to flush the buffers. The flushed content would not be available until the user calls the close method. So, in our case, the flush method does nothing.
fout.close() | ||
fout.close() | ||
|
||
def test_flush_close(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's an idea of this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_flush_close checks if we have a flush method.
test_double_close checks that calling close twice doesn't cause a crash.
Both of these tests were written as part of TDD. They capture the problematic behavior of smart open before the corresponding changes to s3 were made.
smart_open/s3.py
Outdated
@@ -472,6 +473,14 @@ def __exit__(self, exc_type, exc_val, exc_tb): | |||
self.close() | |||
|
|||
|
|||
def _insert_spaces(the_string): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's a function _insert_spaces
(here and below)? I see no usages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used for debugging only. Deleted.
integration-tests/s3_cp.py
Outdated
import smart_open | ||
|
||
s3_url, local_path = sys.argv[1:] | ||
with smart_open.smart_open(s3_url, 'rb') as fin, open(local_path, 'wb') as fout: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description needed: I see copy s3 -> local disk, what you test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an analog of the "aws s3 cp" command. I used it for debugging only, so I deleted it.
S3
Thank you @mpenkov 👍 |
According to the python docs, the close method should have no effect if the file is already closed.
Also, we need to expose a flush method in order to quack like a built-in file object.
There are many commits in this branch, most of them are just back-and-forth between me and the reporting user. It took us a long time to understand what the problem actually was.