
Upload or put object in S3 failing silently #1067

Closed
maxpearl opened this issue Apr 19, 2017 · 39 comments

@maxpearl

maxpearl commented Apr 19, 2017

I've been trying to upload files from a local folder into folders on S3 using Boto3, and it's failing silently, with no indication of why the upload isn't happening.

key_name = folder + '/'
s3_connect = boto3.client('s3', s3_bucket_region,)
# upload File to S3
for filename in os.listdir(folder):
    s3_name = key_name + filename
    print folder, filename, key_name, s3_name
    upload = s3_connect.upload_file(
        s3_name, s3_bucket, key_name,
    )

Printing upload just says "None", with no other information. No upload happens. I've also tried using put_object:

put_obj = s3_connect.put_object(
    Bucket=s3_bucket,
    Key=key_name,
    Body=s3_name,
)

and I get an HTTP response code of 200 - but no files upload.

First, I'd love to solve this problem; but second, this doesn't seem like the right behavior - if an upload doesn't happen, there should be more information about why (although I imagine this might be a limitation of the API?)

@jamesls
Member

jamesls commented Apr 19, 2017

Could you share a sanitized version of your debug logs? That should have more information about what's going on. You can enable debug logs by adding boto3.set_stream_logger('') to your script.
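
For example, a minimal setup (a sketch; the logger must be enabled before the client makes any calls) looks like this:

import boto3

# Send DEBUG-level logs for all boto3/botocore modules to stderr.
boto3.set_stream_logger('')

s3 = boto3.client('s3')
s3.list_buckets()  # every request/response detail is now logged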

@jamesls jamesls added the response-requested Waiting on additional information or feedback. label Apr 19, 2017
@maxpearl
Author

OK, here are the debug logs.

2017-04-24 09:27:29,491 botocore.credentials [DEBUG] Looking for credentials via: env
2017-04-24 09:27:29,491 botocore.credentials [DEBUG] Looking for credentials via: assume-role
2017-04-24 09:27:29,492 botocore.credentials [DEBUG] Looking for credentials via: shared-credentials-file
2017-04-24 09:27:29,492 botocore.credentials [INFO] Found credentials in shared credentials file: ~/.aws/credentials
2017-04-24 09:27:29,492 botocore.loaders [DEBUG] Loading JSON file: /usr/local/lib/python2.7/dist-packages/botocore/data/endpoints.json
2017-04-24 09:27:29,528 botocore.loaders [DEBUG] Loading JSON file: /usr/local/lib/python2.7/dist-packages/botocore/data/s3/2006-03-01/service-2.json
2017-04-24 09:27:29,572 botocore.loaders [DEBUG] Loading JSON file: /usr/local/lib/python2.7/dist-packages/botocore/data/_retry.json
2017-04-24 09:27:29,573 botocore.client [DEBUG] Registering retry handlers for service: s3
2017-04-24 09:27:29,576 botocore.hooks [DEBUG] Event creating-client-class.s3: calling handler <function add_generate_presigned_post at 0x7fc9eebfe758>
2017-04-24 09:27:29,576 botocore.hooks [DEBUG] Event creating-client-class.s3: calling handler <function _handler at 0x7fc9eeb76938>
2017-04-24 09:27:29,585 botocore.hooks [DEBUG] Event creating-client-class.s3: calling handler <function add_generate_presigned_url at 0x7fc9eebfae60>
2017-04-24 09:27:29,585 botocore.args [DEBUG] The s3 config key is not a dictionary type, ignoring its value of: None
2017-04-24 09:27:29,589 botocore.endpoint [DEBUG] Setting s3 timeout as (60, 60)
2017-04-24 09:27:29,589 botocore.client [DEBUG] Defaulting to S3 virtual host style addressing with path style addressing fallback.
2017-04-24 09:27:29,590 s3transfer.utils [DEBUG] Acquiring 0
2017-04-24 09:27:29,593 s3transfer.tasks [DEBUG] UploadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fc9ee3df410>}) about to wait for the following futures []
2017-04-24 09:27:29,593 s3transfer.tasks [DEBUG] UploadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fc9ee3df410>}) done waiting for dependent futures
2017-04-24 09:27:29,593 s3transfer.tasks [DEBUG] Executing task UploadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fc9ee3df410>}) with kwargs {'osutil': <s3transfer.utils.OSUtils object at 0x7fc9ee90f6d0>, 'client': <botocore.client.S3 object at 0x7fc9ee9df390>, 'config': <boto3.s3.transfer.TransferConfig object at 0x7fc9ee90f750>, 'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fc9ee3df410>, 'request_executor': <s3transfer.futures.BoundedExecutor object at 0x7fc9ee90fc50>}
2017-04-24 09:27:29,594 s3transfer.futures [DEBUG] Submitting task PutObjectTask(transfer_id=0, {'extra_args': {}, 'bucket': 'xxxxxxx', 'key': 'SFORG0/'}) to executor <s3transfer.futures.BoundedExecutor object at 0x7fc9ee90fc50> for transfer request: 0.
2017-04-24 09:27:29,594 s3transfer.utils [DEBUG] Acquiring 0
2017-04-24 09:27:29,594 s3transfer.utils [DEBUG] Releasing acquire 0/None
2017-04-24 09:27:29,594 s3transfer.tasks [DEBUG] PutObjectTask(transfer_id=0, {'extra_args': {}, 'bucket': 'xxxxxxx', 'key': 'SFORG0/'}) about to wait for the following futures []
2017-04-24 09:27:29,594 s3transfer.tasks [DEBUG] PutObjectTask(transfer_id=0, {'extra_args': {}, 'bucket': 'xxxxxxx', 'key': 'SFORG0/'}) done waiting for dependent futures
2017-04-24 09:27:29,594 s3transfer.tasks [DEBUG] Executing task PutObjectTask(transfer_id=0, {'extra_args': {}, 'bucket': 'xxxxxxx', 'key': 'SFORG0/'}) with kwargs {'extra_args': {}, 'client': <botocore.client.S3 object at 0x7fc9ee9df390>, 'bucket': 'xxxxxxx', 'key': 'SFORG0/', 'fileobj': <s3transfer.utils.ReadFileChunk object at 0x7fc9ee3df790>}
2017-04-24 09:27:29,595 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function validate_ascii_metadata at 0x7fc9eebae320>
2017-04-24 09:27:29,595 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function sse_md5 at 0x7fc9eebaa848>
2017-04-24 09:27:29,595 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function convert_body_to_file_like_object at 0x7fc9eebae938>
2017-04-24 09:27:29,596 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function validate_bucket_name at 0x7fc9eebaa7d0>
2017-04-24 09:27:29,596 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x7fc9ee90f8d0>>
2017-04-24 09:27:29,596 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function generate_idempotent_uuid at 0x7fc9eebaa488>
2017-04-24 09:27:29,596 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <function conditionally_calculate_md5 at 0x7fc9eebaa758>
2017-04-24 09:27:29,596 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <function add_expect_header at 0x7fc9eebaac08>
2017-04-24 09:27:29,596 botocore.handlers [DEBUG] Adding expect 100 continue header to request.
2017-04-24 09:27:29,596 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <bound method S3RegionRedirector.set_request_url of <botocore.utils.S3RegionRedirector object at 0x7fc9ee90f8d0>>
2017-04-24 09:27:29,597 botocore.endpoint [DEBUG] Making request for OperationModel(name=PutObject) (verify_ssl=True) with params: {'body': <s3transfer.utils.ReadFileChunk object at 0x7fc9ee3df790>, 'url': u'https://s3.amazonaws.com/xxxxxxx/SFORG0/', 'headers': {'Content-MD5': u'LS6xnKmV+8TFGwrKJfZ3jw==', 'Expect': '100-continue', 'User-Agent': 'Boto3/1.4.3 Python/2.7.12 Linux/4.4.0-72-generic Botocore/1.4.92'}, 'context': {'client_region': 'us-east-1', 'signing': {'bucket': 'xxxxxxx'}, 'has_streaming_input': True, 'client_config': <botocore.config.Config object at 0x7fc9ee90f510>}, 'query_string': {}, 'url_path': u'/xxxxxxx/SFORG0/', 'method': u'PUT'}
2017-04-24 09:27:29,597 botocore.hooks [DEBUG] Event request-created.s3.PutObject: calling handler <function disable_upload_callbacks at 0x7fc9ee94f500>
2017-04-24 09:27:29,597 botocore.hooks [DEBUG] Event request-created.s3.PutObject: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7fc9ee90f4d0>>
2017-04-24 09:27:29,597 botocore.hooks [DEBUG] Event before-sign.s3.PutObject: calling handler <function fix_s3_host at 0x7fc9ef1771b8>
2017-04-24 09:27:29,597 botocore.utils [DEBUG] Checking for DNS compatible bucket for: https://s3.amazonaws.com/xxxxxxx/SFORG0/
2017-04-24 09:27:29,597 botocore.utils [DEBUG] URI updated to: https://xxxxxxx.s3.amazonaws.com/SFORG0/
2017-04-24 09:27:29,597 botocore.auth [DEBUG] Calculating signature using hmacv1 auth.
2017-04-24 09:27:29,597 botocore.auth [DEBUG] HTTP request method: PUT
2017-04-24 09:27:29,597 botocore.auth [DEBUG] StringToSign:
PUT
LS6xnKmV+8TFGwrKJfZ3jw==

Mon, 24 Apr 2017 16:27:29 GMT
/xxxxxxx/SFORG0/
2017-04-24 09:27:29,597 botocore.hooks [DEBUG] Event request-created.s3.PutObject: calling handler <function enable_upload_callbacks at 0x7fc9ee94f578>
2017-04-24 09:27:29,598 botocore.endpoint [DEBUG] Sending http request: <PreparedRequest [PUT]>
2017-04-24 09:27:29,599 botocore.vendored.requests.packages.urllib3.connectionpool [INFO] Starting new HTTPS connection (1): xxxxxxx.s3.amazonaws.com
2017-04-24 09:27:30,981 botocore.awsrequest [DEBUG] Waiting for 100 Continue response.
2017-04-24 09:27:32,966 botocore.awsrequest [DEBUG] 100 Continue response seen, now sending request body.
2017-04-24 09:27:32,686 botocore.vendored.requests.packages.urllib3.connectionpool [DEBUG] "PUT /SFORG0/ HTTP/1.1" 200 0
2017-04-24 09:27:32,686 botocore.parsers [DEBUG] Response headers: {'content-length': '0', 'x-amz-id-2': 'lzrvMB3ofUZI6lO4n4IqgEfl263R1aWfhCgnRU79ooJs+ovbeaHdfui9GothMyXQKcDBoiyZ270=', 'server': 'AmazonS3', 'x-amz-request-id': '0390E37A2E403D08', 'etag': '"2d2eb19ca995fbc4c51b0aca25f6778f"', 'date': 'Mon, 24 Apr 2017 16:27:32 GMT'}
2017-04-24 09:27:32,687 botocore.parsers [DEBUG] Response body:

2017-04-24 09:27:32,687 botocore.hooks [DEBUG] Event needs-retry.s3.PutObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7fc9ee9b8050>
2017-04-24 09:27:32,687 botocore.retryhandler [DEBUG] No retry needed.
2017-04-24 09:27:32,687 botocore.hooks [DEBUG] Event needs-retry.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7fc9ee90f8d0>>
2017-04-24 09:27:32,687 s3transfer.utils [DEBUG] Releasing acquire 0/None

@PrettyCities

I'm experiencing the same behavior. Please let me know if you would like my debug logs as well.

@maxpearl
Author

Any update on this?

@iyawnis

iyawnis commented Jul 20, 2017

I have the same issue: files that were meant to be uploaded by a background process are not available. Is there any way of knowing whether an upload has been successful, without making another S3 request?

@pgrzesik

Same problem here - is there any solution to this?

@maxpearl
Author

So, just for your information (and perhaps this will help someone figure out what's going on), using the Service Resource instead of the client works fine. The upload into S3 "folders" works perfectly. Here's some example code:

import boto3

testfile = 'test.csv'
bucket_name = 'bucket-name-here'
folder_name = 'folder-name'

key = folder_name + '/' + testfile
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

bucket.upload_file(testfile, key)

So at least there is a workaround for the problem with the client.

@iyawnis

iyawnis commented Aug 22, 2017

@michellemurrain will this raise an error if the upload fails?

@maxpearl
Author

maxpearl commented Aug 22, 2017

It raises an error if, for instance, the bucket doesn't exist, or there is something wrong with the key or the file, but to my knowledge that wasn't a problem with the client API either. It hasn't failed to do an upload in normal circumstances using "folders" (that is, with a valid bucket, key, and file), so I can't say whether it will raise an error if it does fail (I'll definitely update this ticket if I run into that).

@iyawnis

iyawnis commented Aug 23, 2017

The client will raise an error in those scenarios as well. The problem we have had is different. I haven't actually seen it happen (it's always in background processes), but the upload process completes without error and the file is not there. At this stage I can only assume this may be related to connection issues, but that feels unlikely, as the code is running on EC2 in the same region as the S3 bucket. In any case, it needs more investigation on our end as well.

@maxpearl
Author

The only situation in which I've ever had that behavior (upload fails, no error) was using the client to store files in S3 in a folder structure. Storing files in S3 using the client without a folder, or using the service resource, hasn't led to any issues in file uploads. We're also using an EC2 instance to run this code.

@iyawnis

iyawnis commented Aug 24, 2017

Hm, we are storing the files inside folders as well, using client('s3'). I'll try to find time to switch to resource and see if the issue persists.

@PrettyCities

In my case, it seems as if uploads were not failing. Rather, there was a gap between the time when the client reported success and when the files actually appeared in S3. Not sure if this is a network issue or perhaps something related to background processes not firing immediately, as @latusaki referred to.
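
If the delay between a reported success and the object becoming visible is the concern, one option (a sketch; bucket and key names are placeholders) is boto3's built-in object_exists waiter, which polls HeadObject until the key can be seen:

import boto3

s3 = boto3.client('s3')
s3.upload_file('test.csv', 'your-bucket-name', 'folder-name/test.csv')

# Polls HeadObject until the object is visible; raises WaiterError on timeout.
waiter = s3.get_waiter('object_exists')
waiter.wait(Bucket='your-bucket-name', Key='folder-name/test.csv')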

@keremgocen

I'm having a very similar silent-failing upload issue - was using resources any help, @latusaki?

@iyawnis

iyawnis commented May 17, 2018

We added logic to trigger a re-fetch and upload on the endpoint that downloads S3 files, rather than on the upload. The upload is done using client.upload_fileobj. I can't really recall any details of the issue mentioned; I haven't worked on this code since. It's likely that the problem was elsewhere rather than in S3.

@keremgocen

So it was not using resource instead of client that fixed the problem. I'll try to debug the response from S3 locally before attempting multipart uploads. I have no clue yet what the cause is, but uploads of small files are working flawlessly. Upload of files larger than 10MB was successful only once out of ~20 attempts.

@maxpearl
Author

I do think of using resource instead of client as a work-around, rather than a fix. I don't know what's going on at the code level to make using client a problem. I'm hoping to actually dig into this sometime in the next couple of weeks.

@keremgocen

In my case there were no issues locally with files up to 100MB, but uploads are failing in our staging environment. The difference is that our S3 uploads are triggered via Django views, which are wrapped by gunicorn workers that seem to have a default 30-second timeout window. That could be our root cause, rather than S3 or the boto library itself.
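
If worker timeouts around large uploads are the suspect, the transfer settings can be tuned so no single request runs too long - a sketch, with illustrative values rather than recommendations:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

# Force multipart for files over 8MB and limit concurrency; smaller parts
# mean shorter individual HTTP requests inside a worker's timeout window.
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,
    multipart_chunksize=8 * 1024 * 1024,
    max_concurrency=4,
)
s3.upload_file('big-file.bin', 'your-bucket-name', 'folder/big-file.bin', Config=config)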

@bluppfisk

bluppfisk commented Jun 20, 2018

Having a leading slash in your path to the object will create a folder with a blank name in your bucket root, which holds the uploads.
e.g.:

path = "/uploads/"  # note the leading slash
s3.Object(my_bucket, path + filename).put(Body=data)

will create //uploads in your bucket root (note the blank-named folder) and put the object in there rather than in /uploads. Since boto was also spamming the bucket root with odd files, I didn't notice the empty folder until much later and lost a lot of time. Could this be your problem, too?
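
A simple guard against this (a sketch reusing the names above) is to strip any leading slash before building the key:

key = (path + filename).lstrip('/')  # S3 keys should not start with '/'
s3.Object(my_bucket, key).put(Body=data)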

@gladsonvm

gladsonvm commented Jul 2, 2018

@maxpearl #1067 (comment) saved my day. But I still think put_object not working properly should be treated as a known bug. The interesting and weird thing is that this issue is intermittent with s3_put_object.

@yahavb

yahavb commented Sep 3, 2018

+1 - I am also getting timeouts for calls to S3 using boto3. I tried all the permutations mentioned above with no luck, i.e., uploading as resource, object, etc. Even the list_buckets() call times out. However, the put and upload calls do upload the files to the bucket, but then time out.
Does anybody have a lead on what the issue is?

@yahavb

yahavb commented Sep 5, 2018

My issue was caused by Lambda disabling internet access when an explicit VPC is configured. It has nothing to do with S3 or boto3. Sorry for the confusion!

https://medium.com/@philippholly/aws-lambda-enable-outgoing-internet-access-within-vpc-8dd250e11e12

@cagriar

cagriar commented May 20, 2019

I'm experiencing the same problem when using both client and resource objects. Have you experienced a similar issue with your workaround as well, @maxpearl?

@maxpearl
Author

Hi @cagriar - the issue only arose using the client, and since we switched to using the resource, we haven't run into this again at all.

@pawciobiel

If the region is not specified or misconfigured, upload_fileobj fails silently.

conn = boto3.client('s3')
conn.upload_fileobj(content, bucket_name, filename)
return True

The PUT request gets back an HTTP 400 response:

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AuthorizationHeaderMalformed</Code><Message>The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'eu-west-1'</Message><Region>eu-west-1</Region><RequestId>AAAAA</RequestId><HostId>BBBBB</HostId></Error>
boto3==1.9.151
botocore==1.12.151

Could this be raised as an S3 exception or ClientError, please?
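
Until then, a workable defense (a sketch; the region, bucket, and file names are placeholders) is to pin the client to the bucket's region and catch the errors the transfer layer can surface:

import boto3
from botocore.exceptions import ClientError
from boto3.exceptions import S3UploadFailedError

# Pin the client to the bucket's actual region so the auth header matches.
s3 = boto3.client('s3', region_name='eu-west-1')

with open('local-file.txt', 'rb') as fileobj:
    try:
        s3.upload_fileobj(fileobj, 'your-bucket-name', 'folder/local-file.txt')
    except (ClientError, S3UploadFailedError) as err:
        print('upload failed:', err)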

@E-G-C

E-G-C commented Nov 11, 2019

In my case it fails silently - it won't create the object and raises no exception - unless the object's key has a dot in it. For example:

import boto3
from io import BytesIO

s3 = boto3.resource('s3')

content = 'hello world'
bytesIO = BytesIO()
bytesIO.write(content.encode('utf8'))
bytesIO.seek(0)

s3_object = s3.Object(bucket_name='your-bucket-name', key='myFolder/filename')
result = s3_object.put(Body=bytesIO)

However, changing the key to .myFolder/filename or myFolder/filename.txt will do it

@geneyx

geneyx commented Nov 30, 2019

Had the same issue, and using resource solved it.
The client was returning a None response, which I couldn't find any documentation for.

@swetashre swetashre added s3 and removed response-requested Waiting on additional information or feedback. labels Mar 2, 2020
@swetashre
Contributor

swetashre commented Mar 2, 2020

@maxpearl - Following up on this issue. According to this code snippet:

key_name = folder + '/'
s3_connect = boto3.client('s3', s3_bucket_region,)
# upload File to S3
for filename in os.listdir(folder):
    s3_name = key_name + filename
    print folder, filename, key_name, s3_name
    upload = s3_connect.upload_file(
        s3_name, s3_bucket, key_name,
    )

Here key_name will be folder/, so all your files will be uploaded with that name. So if you go inside folder/ there won't be any files, because each file is getting uploaded under the name folder/. Here key_name should be folder/filename instead of folder/.

So if you replace key_name with s3_name, then you will be able to see all your files in that folder.
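
For reference, a corrected version of the original loop (a sketch keeping the original variable names) would be:

import os
import boto3

s3_connect = boto3.client('s3', region_name=s3_bucket_region)
key_prefix = folder + '/'
for filename in os.listdir(folder):
    s3_name = key_prefix + filename  # key includes the filename, e.g. 'folder/file.csv'
    # upload_file(Filename, Bucket, Key): local path first, then bucket, then key.
    s3_connect.upload_file(os.path.join(folder, filename), s3_bucket, s3_name)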

And this code snippet

import boto3

testfile = 'test.csv'
bucket_name = 'bucket-name-here'
folder_name = 'folder-name'

key = folder_name + '/' + testfile
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

bucket.upload_file(testfile, key)

is working because here you have specified the key as folder/testfile. In this case, if you specify the key as folder/, you will be able to reproduce the same behavior you are seeing with the client.

Please let me know if you have any questions.

@swetashre swetashre added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Mar 2, 2020
@swetashre swetashre self-assigned this Mar 2, 2020
@maxpearl
Author

maxpearl commented Mar 3, 2020

You are correct; the code way above (a few years old now) is incorrect. I did so much testing of the issue that I have no idea which iteration of testing that particular code is from. Theoretically, though, it shouldn't have failed silently - it should have just created the folder (which it did not do in my testing).

I did testing just now, and I have been unable at this time, with either Python 2 or 3, to recreate the issue I had before - either with correct or incorrect code - a key name with a folder without a file will correctly create the folder, and the client creates a file just fine. I will close this now.

@maxpearl maxpearl closed this as completed Mar 3, 2020
@no-response no-response bot removed the closing-soon This issue will automatically close in 4 days unless further comments are made. label Mar 3, 2020
@skghosh-invn

In my case, using boto3.client, I have observed that if any intermediate folder is missing, the client upload fails silently without writing the file. For example, I am uploading files like:

a -> <bucket>/f1/f2/a
b -> <bucket>/f1/f2/b

# This fails silently, as the folder "f-new" does not exist. Since S3 is entirely object-based - do we still need to create "f-new"?
c -> <bucket>/f1/f-new/c

@monkut

monkut commented Jun 23, 2020

Having an issue with this now. It seems I had this before and referenced this issue. At the time, changing from using client to resource resolved the issue for me.

But now I'm experiencing the same thing with the bucket Resource.

I'm trying to upload a single file with no directory in the key name, just {uuid4}.zip.

@komlanA

komlanA commented Oct 21, 2021

For anyone that comes across this thread: I was having the same issues. I thought that upload_file was failing silently, but the reason I could not see the key after uploading it was that I was using list_objects to list all my keys. list_objects has a limit on how many keys it will return, so I fixed this by using a paginator and was able to see my keys afterwards.
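
For anyone wanting to do the same, a minimal sketch (bucket and prefix are placeholders):

import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# A bare list_objects call returns at most 1000 keys; the paginator
# walks every page so no uploaded object is missed.
for page in paginator.paginate(Bucket='your-bucket-name', Prefix='folder/'):
    for obj in page.get('Contents', []):
        print(obj['Key'])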

@monkut

monkut commented Jun 5, 2023

Why am I back here? Seeing the same issue now with s3client.upload_file(), and this isn't new code. I suspect this is because I just updated boto3 to 1.26.146.

Changed from using upload_file() to upload_fileobj(), but I'm seeing the same silent failing.

Also tried to set the region via AWS_DEFAULT_REGION, but that had no effect on the silent failing.

@avlm

avlm commented Jul 10, 2023

Same issue here! I also tried s3_resource.Object(destination_bucket, destination_key).copy_from(CopySource=f'{source_bucket}/{source_key}'), but it didn't work. s3_client.copy_object and s3_client.put_object didn't work either.

Does anybody have an update here?

@lb1mg

lb1mg commented Jan 31, 2024

Facing this issue in prod! Setting custom checks to raise exceptions. Can we please reopen this issue?!

@Rahulreddy1020

Any solution for this? I am having the same issue.

@tim-finnigan
Contributor

@Rahulreddy1020 could you create a new issue for further investigation? Please share a code snippet for reproducing the issue, and debug logs (with any sensitive info redacted), which you can get by adding boto3.set_stream_logger('') to your script. This issue is quite old and has a lot of comments, so I think it would be easier to continue the discussion in a new issue.

@happinessbaby

Make sure your key's folder name doesn't start with a slash, it should be a relative path. That was my mistake.

@stephenscliu

> Make sure your key's folder name doesn't start with a slash, it should be a relative path. That was my mistake.

Removing the leading slash in the key also solved my issue (previously upload_file was failing silently - no error or exception, but the object didn't upload).
