S3 upload failing silently if the connection lost during upload #244

Closed
navado opened this issue Oct 23, 2018 · 4 comments · Fixed by #434

Comments

@navado

navado commented Oct 23, 2018

Code sample:

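    from smart_open import smart_open

    # furl is an S3 URL (s3://...); eventsToOffload is an iterable of strings
    with smart_open(furl, 'w') as fout:
        for event in eventsToOffload:
            fout.write(event + "\n")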

If the connection is interrupted before smart_open(furl, 'w') is called, the open fails with an exception after 5 retries:

2018-10-23 16:46:08,576 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (1): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:46:09,302 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (2): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:46:09,817 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (3): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:46:13,236 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (4): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:46:13,528 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (5): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:46:14,860 - ServiceWritingToS3 - ERROR - Exception {ConnectionError}
2018-10-23 16:46:14,861 - ServiceWritingToS3 - ERROR - Could not connect to the endpoint URL: “https://s3_bucket_for_tests.s3.amazonaws.com/”

But if the connection is interrupted after the stream was successfully opened and a few bytes were written to it, the upload completes silently without any visible error, even though no data was actually uploaded.

2018-10-23 16:47:52,259 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (1): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:47:52,974 - ServiceWritingToS3 - INFO - opening stream into: s3://***:***@s3_bucket_for_tests/prefix/2/2018/10/14/157196.csv.gz
2018-10-23 16:47:54,798 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (1): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:47:55,338 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (2): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:47:59,858 - smart_open.s3 - INFO - uploading part #1, 2042 bytes (total 0.000GB)
2018-10-23 16:49:01,271 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (3): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:49:01,873 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (4): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:49:04,649 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (5): s3_bucket_for_tests.s3.amazonaws.com
2018-10-23 16:49:09,998 - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Starting new HTTPS connection (6): s3_bucket_for_tests.s3.amazonaws.com
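
Until the library reports such failures itself, a caller could confirm that the object really exists after the with block exits. The following is only a sketch under assumptions: check_uploaded is a hypothetical helper, and client is assumed to behave like a boto3 S3 client, whose head_object call returns a dict with a "ContentLength" entry and raises an error for a missing key.

```python
def check_uploaded(client, bucket, key, min_size=1):
    """Return the stored object's size, or raise IOError if it is missing or too small.

    `client` is assumed to behave like a boto3 S3 client, i.e. to offer
    head_object(Bucket=..., Key=...) returning a dict with "ContentLength".
    """
    try:
        head = client.head_object(Bucket=bucket, Key=key)
    except Exception as err:  # boto3 raises botocore.exceptions.ClientError for a missing key
        raise IOError("s3://%s/%s was not uploaded" % (bucket, key)) from err
    size = head.get("ContentLength", 0)
    if size < min_size:
        raise IOError("s3://%s/%s has only %d bytes" % (bucket, key, size))
    return size
```

With boto3 this could be called as check_uploaded(boto3.client("s3"), bucket, key) right after the write loop. An unfinished multipart upload leaves no object behind, so the check should fail in the scenario described above.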

From the code, it looks like the status of the upload is not checked in smart_open, at smart_open/s3.py#L460.
It would probably be better to raise an error when uploading a chunk fails.
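
A rough sketch of what such a check might look like, not the actual smart_open code: upload_part_checked and PartUploadError are hypothetical names, and upload stands for whatever callable performs the real part upload. It is assumed to return a dict with an "ETag" key on success, as boto3's part-upload responses do.

```python
class PartUploadError(IOError):
    """Hypothetical error type for this sketch; not part of smart_open."""


def upload_part_checked(upload, part_num, body, attempts=3):
    """Call upload(part_num, body) up to `attempts` times and return the part's ETag.

    `upload` is a hypothetical callable wrapping the real S3 part upload; it is
    assumed to return a dict containing an "ETag" key on success.
    """
    last_error = None
    for _ in range(attempts):
        try:
            response = upload(part_num, body)
        except Exception as err:  # e.g. a connection error raised by the HTTP layer
            last_error = err
            continue
        if isinstance(response, dict) and response.get("ETag"):
            return response["ETag"]
        last_error = ValueError("no ETag in response: %r" % (response,))
    # Every attempt failed: surface the problem instead of continuing silently.
    raise PartUploadError(
        "uploading part #%d failed after %d attempts" % (part_num, attempts)
    ) from last_error
```

Raising from the last underlying error keeps the original traceback attached, so the caller sees both that the part was lost and why.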

@mpenkov
Collaborator

mpenkov commented Oct 24, 2018

Good catch, @navado. Are you able to make a PR? It's still not too late for #hacktoberfest ;)

@menshikh-iv
Contributor

@navado hi, I see that you are not participating in #hacktoberfest. Just to clarify @mpenkov's comment: if you are interested, check out https://hacktoberfest.digitalocean.com (you can get a free T-shirt for 5 PRs to any open-source projects).

@mpenkov
Collaborator

mpenkov commented Jan 8, 2020

@navado How are you disabling the underlying internet connection?

I tried to simulate it by turning off my WiFi, and botocore barfed tracebacks all over my console. Those tracebacks are entirely inside botocore and its version of urllib3, so I'm not sure what we can do about them.

@navado
Author

navado commented Jan 8, 2020

@mpenkov For example, connect with a cable and then disconnect the cable during the transfer. The same happens with mobile roaming and signal loss.
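
For a repeatable test that does not depend on unplugging a cable, the connection loss could also be simulated. A minimal sketch, assuming a hypothetical test double (DroppingSink is not part of smart_open or boto3) that accepts a fixed number of bytes and then raises the built-in ConnectionError:

```python
import io


class DroppingSink(io.RawIOBase):
    """Hypothetical test double: accepts up to `limit` bytes, then raises ConnectionError."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.received = 0  # bytes accepted so far

    def writable(self):
        return True

    def write(self, data):
        # Refuse any write that would push the total past the limit.
        if self.received + len(data) > self.limit:
            raise ConnectionError("simulated connection loss")
        self.received += len(data)
        return len(data)
```

A writer under test can be pointed at such a sink to check that the error reaches the caller instead of being swallowed.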

mpenkov added this to the 1.10.0 milestone Mar 11, 2020
mpenkov self-assigned this Mar 11, 2020