Intermittent PutObject Failures #767

Closed

grapevine2383 opened this issue Sep 16, 2015 · 20 comments

@grapevine2383

Every day I get 1-10 of these errors out of a couple thousand putObject requests, and the putObject call fails. I make thousands more curl requests to other servers daily with no problems, so I don't see my server being at fault right now.

Code (PHP 5.5.28, Apache 2.4.16)

include('composer/vendor/autoload.php'); // latest version of SDK (3.3.6)
$s3 = new Aws\S3\S3Client(['version' => 'latest', 'region' => 'us-east-1',
    'credentials' => ['key' => KEY, 'secret' => SECRET], 'scheme' => 'http']);
$s3->putObject(['Key' => $name, 'Bucket' => BUCKET, 'Body' => $image_data,
    'ACL' => 'public-read', 'ContentType' => 'image/jpg']); // fails here intermittently

Exception (every day, 1-10 similar messages)

PHP Fatal error: Uncaught exception 'Aws\S3\Exception\S3Exception' with message 'Error executing "PutObject" on "http://s3.amazonaws.com/mybucket/image055f80f4f085ee.jpg"; AWS HTTP error: cURL error 56: Recv failure: Connection reset by peer (see http://curl.haxx.se/libcurl/c/libcurl-errors.html)'

exception 'GuzzleHttp\Exception\RequestException' with message 'cURL error 56: Recv failure: Connection reset by peer (see http://curl.haxx.se/libcurl/c/libcurl-errors.html)' in /home/public_html/composer/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php:187
Stack trace:
#0 /home/public_html/composer/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(150): GuzzleHttp\Handler\CurlFactory::createRejection(Object(GuzzleHttp\Handler\EasyHandle), Array)
#1 /home/public_html/composer/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(103): GuzzleHttp\Handler\CurlFactory::finishError(Object(GuzzleHttp\Handler\CurlMultiHandler), Object(GuzzleHttp\Handler\EasyHandle), Obj in /home/public_html/composer/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php on line 152

@jeskew (Contributor) commented Sep 16, 2015

Reset connection errors are retried up to three times before the request fails. You could try increasing that value by adding a retries key to the options array passed to the client constructor:

$s3=new Aws\S3\S3Client([
    'version' => 'latest',
    'region' => 'us-east-1',
    'credentials' => [
        'key' => KEY,
        'secret' => SECRET
    ],
    'scheme' => 'http',
    'retries' => 11,
]);

I suspect that you would see fewer reset connections if you allowed the SDK to use https, but I don't have any numbers to back that up.

@grapevine2383 (Author)

Thanks for the response. I was using https before and switched to http a week ago to see if that would help. I will try increasing the retries and let you know how that goes.

@grapevine2383 (Author)

Increasing retries to 11 actually doubled the daily number of errors. Instead of cURL errors, it's now returning 500 errors:

PHP Fatal error:  Uncaught exception 'Aws\S3\Exception\S3Exception' with message 'Error executing "PutObject" on "http://s3.amazonaws.com/mybucket/sdsadadster.jpg"; AWS HTTP error: Server error: 500 InternalError (server): We encountered an internal error. Please try again. - <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>0021CF042749BE89</RequestId><HostId>jhdJOv7Iv4203YhvFGiTMu1AJs7xW5t+N/MOaMiaYBgrBmQbqHUoMYHGtBpfr9MYAkRNJSkHVM8=</HostId></Error>'

exception 'GuzzleHttp\Exception\ServerException' with message 'Server error: 500' in /home/public_html/composer/vendor/guzzlehttp/guzzle/src/Middleware.php:68
Stack trace:
#0 /home/public_html/composer/vendor/guzzlehttp/promises/src/Promise.php(199): GuzzleHttp\Middleware::GuzzleHttp\{closure}(Object(GuzzleHttp\Psr7\Response))
#1 /home/public_html/composer/vendor/guzzlehttp/promises/src/Promise.php(152): GuzzleHttp\Promi in /home/public_html/composer/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php on line 152

@grapevine2383 (Author)

Searching for information on 500 errors brings me to the "Handling Errors" section of https://aws.amazon.com/articles/1904, which recommends exponential backoff. Is this built into the PHP SDK? I would have thought they would have figured this out after 7 years of running S3; my daily peak is only around 20 requests/second.

@jeskew (Contributor) commented Sep 17, 2015

The S3 client does implement exponential delay for request retries. The retries option means "retry this request up to X times," so increasing it from 3 to 11 is unlikely to be related to the increased number of errors you're seeing; anything that would fail with 11 retries would fail more quickly with 3.

The one thing I can think of is that Guzzle 6 only adds an 'Expect: 100-continue' header to requests with payloads greater than 1 MB. Are the failing payloads all smaller than 1 MB?

@grapevine2383 (Author)

My failing payloads are mostly less than 1 MB but can occasionally go up to a maximum of 2 MB.

@jeskew (Contributor) commented Sep 18, 2015

@jimmaay You might want to try adding the Expect header yourself (check out the docs page on mapRequest middleware if you're unsure how to do so).

But since you're seeing this error with requests on which that header has already been set, I think the safest route would be for you to aggressively retry uploads that fail.
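
A minimal sketch of what that might look like, assuming the aws-sdk-php v3 handler-list API (Aws\Middleware::mapRequest and HandlerList::appendBuild); the 'add-expect-header' name is just an arbitrary label for this example:

use Aws\Middleware;
use Psr\Http\Message\RequestInterface;

// Force the Expect header onto every request, regardless of payload size,
// using the $s3 client created earlier.
$s3->getHandlerList()->appendBuild(
    Middleware::mapRequest(function (RequestInterface $request) {
        return $request->withHeader('Expect', '100-continue');
    }),
    'add-expect-header'
);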

@grapevine2383 (Author)

What is the best way to aggressively retry uploads? I am fine even if the upload takes a minute. Is there an option to increase the exponential backoff timing without changing the library?

@jeskew (Contributor) commented Sep 18, 2015

The backoff as implemented will increase rapidly with each retry. Increasing the number of retries to a value you feel comfortable with is the easiest way, though you can wrap the upload in a try/catch loop in your code, too, if you'd prefer.
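
As a rough sketch of that try/catch approach (the attempt cap and the sleep-based backoff below are arbitrary choices for illustration, not SDK defaults):

use Aws\S3\Exception\S3Exception;

$maxAttempts = 5; // arbitrary cap for this example
for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
    try {
        $s3->putObject([
            'Key'         => $name,
            'Bucket'      => BUCKET,
            'Body'        => $image_data,
            'ACL'         => 'public-read',
            'ContentType' => 'image/jpg',
        ]);
        break; // success, stop retrying
    } catch (S3Exception $e) {
        if ($attempt === $maxAttempts) {
            throw $e; // give up after the final attempt
        }
        sleep((int) pow(2, $attempt - 1)); // crude backoff: 1s, 2s, 4s, 8s
    }
}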

@grapevine2383 (Author)

I'll try increasing retries to 50. But after I reset retries back to the default yesterday, the error rate returned to normal. Could retrying aggressively put more load on the S3 servers and in turn give me more errors?

@grapevine2383 (Author)

You said it retries up to X times. Does that just mean it stops retrying once a request succeeds, or are there other conditions that stop the retries?

@jeskew (Contributor) commented Sep 18, 2015

If you get a successful response, then the request will not be retried. But you're right -- retrying those failures will put more strain on your hardware, which can be problematic if the root of your issue is contention on your end of the network. (It will also marginally increase the load on S3, but I think those servers can handle it. :) )

@grapevine2383 (Author)

When I upped the retries to 11, Amazon's servers returned 500 errors ("We encountered an internal error. Please try again.") as responses, so in that case it's definitely not my servers.

@jeskew (Contributor) commented Sep 18, 2015

I think you'd get farther talking to someone at S3. Have you tried reaching out to AWS support or asking a question on the S3 forum? AWS support would have a lot more insight into your account.

@grapevine2383 (Author)

Okay, I will bring it up on the S3 forums and with their startup support. I was hoping there would be a simple programmatic solution, since it's only 1-10 errors a day out of thousands of requests, and the retries option seemed like it would work. Would building a reproducible PHP test case for developers help with this? It seems to be a problem with either not enough backoff or something related to retries. A simple case would be a script that uploads 100-200 randomly sized text blocks of 100-200 KB each per second; I'm sure it would produce 500 errors.

@jeskew (Contributor) commented Sep 18, 2015

500 errors do not originate from the SDK, which is why I think S3 support would be able to give you a more comprehensive answer. FWIW, the exponential backoff algorithm in the SDK was provided by the S3 team, so I presume it's in line with what they want. From what I've read on the S3 forum, 500 errors are not unusual.

@grapevine2383 (Author)

Yes, I read that it is (unfortunately) not unusual to get 500 errors from S3. I just wish the SDK's exponential back-off retries would prevent these errors, as the documentation suggests they should.

jeskew added a commit to jeskew/aws-sdk-php that referenced this issue Sep 19, 2015
@jeskew (Contributor) commented Sep 19, 2015

It looks like connection reset errors were not being retried with all versions of Guzzle. I added an additional check for them to the S3 retry handler in 2f985ba.
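
For anyone on a release without that commit, a hypothetical interim workaround is to append a second retry middleware whose decider specifically retries connection-reset failures; the calls below assume the aws-sdk-php v3 Aws\Middleware::retry, Aws\RetryMiddleware::exponentialDelay, and HandlerList::appendSign APIs, and the 'retry-connection-reset' name is just a label chosen for this example:

use Aws\Middleware;
use Aws\RetryMiddleware;

// Retry a command up to three extra times when the underlying error message
// mentions a connection reset; otherwise leave it to the default retry handler.
$decider = function ($retries, $command, $request = null, $result = null, $error = null) {
    return $retries < 3
        && $error !== null
        && strpos($error->getMessage(), 'Connection reset by peer') !== false;
};

$s3->getHandlerList()->appendSign(
    Middleware::retry($decider, [RetryMiddleware::class, 'exponentialDelay']),
    'retry-connection-reset'
);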

@grapevine2383 (Author)

This seems like it will fix it. I will implement it and let you know how it goes. Thanks for the help.

jeskew added a commit to jeskew/aws-sdk-php that referenced this issue Sep 19, 2015
@grapevine2383 (Author)

That worked. No more errors. Thanks again for the help!
