ECONNRESET exceptions when running in Lambda environment #1196
Having a similar issue using the DynamoDB client.

Lambda:

```ts
import { DynamoDBClient, DescribeTableCommand } from "@aws-sdk/client-dynamodb"

const dynamo = new DynamoDBClient({})

export const tempDebug = async (): Promise<object> => {
  const res = await dynamo.send(new DescribeTableCommand({
    TableName: '<TableName>'
  }))
  return Promise.resolve(res.Table)
}
```

Local:

```ts
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda"

declare const TextDecoder

const lambda = new LambdaClient({})

;(async () => {
  let counter = 0
  // eslint-disable-next-line no-constant-condition
  while (true) {
    console.log(counter)
    counter++
    const res = await lambda.send(new InvokeCommand({
      FunctionName: '<FunctionName>'
    }))
    const obj = JSON.parse(new TextDecoder("utf-8").decode(res.Payload))
    if (obj.errorType === 'Error') {
      console.log(obj)
      break
    }
    //await new Promise(resolve => setTimeout(resolve, 5 * 60 * 1000))
    await new Promise(resolve => setTimeout(resolve, 90 * 1000))
  }
})()
```

Produces the following errors consistently when run with the intervals shown above:

```json
{
  "errorType": "Error",
  "errorMessage": "write EPIPE",
  "code": "EPIPE",
  "errno": "EPIPE",
  "syscall": "write",
  "$metadata": {
    "retries": 0,
    "totalRetryDelay": 0
  },
  "stack": [
    "Error: write EPIPE",
    " at WriteWrap.onWriteComplete [as oncomplete] (internal/stream_base_commons.js:92:16)",
    " at writevGeneric (internal/stream_base_commons.js:132:26)",
    " at TLSSocket.Socket._writeGeneric (net.js:782:11)",
    " at TLSSocket.Socket._writev (net.js:791:8)",
    " at doWrite (_stream_writable.js:401:12)",
    " at clearBuffer (_stream_writable.js:519:5)",
    " at TLSSocket.Writable.uncork (_stream_writable.js:338:7)",
    " at ClientRequest._flushOutput (_http_outgoing.js:862:10)",
    " at ClientRequest._flush (_http_outgoing.js:831:22)",
    " at _http_client.js:315:47"
  ]
}
```

```json
{
  "errorType": "Error",
  "errorMessage": "socket hang up",
  "code": "ECONNRESET",
  "$metadata": {
    "retries": 0,
    "totalRetryDelay": 0
  },
  "stack": [
    "Error: socket hang up",
    " at connResetException (internal/errors.js:608:14)",
    " at TLSSocket.socketOnEnd (_http_client.js:453:23)",
    " at TLSSocket.emit (events.js:322:22)",
    " at endReadableNT (_stream_readable.js:1187:12)",
    " at processTicksAndRejections (internal/process/task_queues.js:84:21)"
  ]
}
```

Works as expected when run with 1-minute intervals.
Edit: Better workaround in #1196 (comment).

I think I found a workaround: the client needs to call the destroy method.

```ts
import { DynamoDBClient, DescribeTableCommand } from "@aws-sdk/client-dynamodb"

const dynamo = new DynamoDBClient({})

export const tempDebug = async (): Promise<object> => {
  const res = await dynamo.send(new DescribeTableCommand({
    TableName: '<TableName>'
  }))
  dynamo.destroy()
  return Promise.resolve(res.Table)
}
```

I have not been able to reproduce the issue when run with ts-node v8.8.1 or with a Docker image (node 12.17.0).
@samirda thanks for that. However, for highly trafficked Lambdas this will eliminate the benefits of keep-alive. Either way, ECONNRESET is a retryable error and should be handled that way.
@monken I agree, but the workaround should at least make things more error-safe until a proper fix is in place.
I've managed to work around this using this configuration (updated for gamma):

```ts
import {
  StandardRetryStrategy,
  defaultRetryDecider,
} from '@aws-sdk/middleware-retry';
import { SdkError } from '@aws-sdk/smithy-client';

const retryDecider = (err: SdkError & { code?: string }) => {
  if (
    'code' in err &&
    (err.code === 'ECONNRESET' ||
      err.code === 'EPIPE' ||
      err.code === 'ETIMEDOUT')
  ) {
    return true;
  } else {
    return defaultRetryDecider(err);
  }
};

// eslint-disable-next-line @typescript-eslint/require-await
const retryStrategy = new StandardRetryStrategy(async () => '3', {
  retryDecider,
});

export const defaultClientConfig = {
  maxRetries: 3,
  retryStrategy,
};
```

It would be nice if this was built in to
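For context, a usage sketch (not part of the original comment; the import path is hypothetical) showing how the exported config can be spread into a client:

```ts
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
// hypothetical module path for the defaultClientConfig exported in the snippet above
import { defaultClientConfig } from "./clientConfig";

// Every client built this way retries ECONNRESET/EPIPE/ETIMEDOUT via the custom retryDecider.
const dynamo = new DynamoDBClient({ ...defaultClientConfig });
```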
@studds Thanks for the elegant solution. This is working perfectly for me now.
Is there anything new, or are there plans to fix this? At peak, every second or third call of a Lambda using the DynamoDB client ends with one of these errors, so I'm unable to use SDK v3 as a replacement for the old SDK.
Clients in 1.0.0-gamma.3 now retry in case of transient errors. The retry logic doesn't check for ECONNRESET, ETIMEDOUT, or EPIPE though.
Got the same problem using

I confirm, it still happens with

I think this has been resolved in
There is no mention of a fix in the release notes: https://github.com/aws/aws-sdk-js-v3/releases/tag/v1.0.0-gamma.6
I know, but I have not been able to reproduce the errors.
Issues are still happening in

Me too, still errors with

I was too quick, errors are still happening in

There's

I am testing
Has anyone from AWS or a maintainer even commented on this issue?

@AllanFly120 mind chiming in?
I'm also seeing this regularly. I increased maxAttempts without any significant improvement.
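For reference, raising the attempt count on a v3 client is typically just a constructor option (a sketch; it did not help here because these socket errors were not classified as retryable at the time):

```ts
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";

// maxAttempts includes the initial call, so 5 allows up to 4 retries.
const dynamo = new DynamoDBClient({ maxAttempts: 5 });
```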
So we are at release candidate 4 and this problem has not even been acknowledged 😞
Using the fix of the TS implementation above.

Hoping for an actual fix in the next RC.
Documentation of the errors discussed in this thread:
Node.js docs: https://nodejs.org/api/errors.html#errors_common_system_errors
Findings with the example code given in #1196 (comment):
The smithy-client doesn't re-throw the error: aws-sdk-js-v3/packages/smithy-client/src/client.ts, lines 54 to 56 at 41e9359.

Question to answer: when requestHandler throws an error, why is the deserialize step not called?

The deserializer is not called because the next function in deserializerMiddleware throws: aws-sdk-js-v3/packages/middleware-serde/src/deserializerMiddleware.ts, lines 19 to 23 at 41e9359.

As explained in the hack in #1196 (comment), this can be caught and retried in retryMiddleware, as it has a try/catch: aws-sdk-js-v3/packages/middleware-retry/src/defaultStrategy.ts, lines 125 to 147 at 41e9359.

Question: should the deserializer be called when the next function throws an error?
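For illustration, a simplified sketch of the retry-loop shape being described (an assumption about the mechanism, not the SDK's actual defaultStrategy.ts): errors thrown by the downstream next() call, which ends in the request handler, are caught, classified, and either retried or re-thrown.

```ts
// Simplified sketch of a retry loop with the try/catch discussed above.
async function retryLoop<Output>(
  next: () => Promise<Output>,
  shouldRetry: (err: unknown) => boolean,
  maxAttempts: number
): Promise<Output> {
  for (let attempt = 1; ; attempt++) {
    try {
      // next() ultimately calls the request handler; a socket-level
      // ECONNRESET/EPIPE surfaces here as a thrown error, not an HTTP response.
      return await next();
    } catch (err) {
      if (attempt >= maxAttempts || !shouldRetry(err)) {
        throw err; // out of attempts or non-retryable: propagate to the caller
      }
      // brief backoff before retrying
      await new Promise((resolve) => setTimeout(resolve, 100 * attempt));
    }
  }
}
```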
Just wondering (but nice to see the issue is being looked at): are you saying this is "just" an error that's not properly handled? Because it seems to me like this is hiding the issue under the rug, and retrying when there is a failure after a (long) timeout: there is no reason for the request to fail, as there is no throttling or anything happening. I can easily reproduce the bug by calling the code once every 10-ish minutes: the lambda process will still be around, there is no other load or anything in parallel, but the previous request was not properly "cleaned up". Again - this has been mentioned before - simply swapping the new v3 request with the old aws-sdk one works right away without any issue. We migrated a ton of services to use the v3 SDK and can't merge that branch to production because of this.
Hi @rraziel, I'm currently looking into how JS SDK v2 handles this and will provide a fix in v3 accordingly.
The current behavior is undesirable, and the SDK should retry the error instead of asking the user to do it.
The fix needs to be in node-http-handler, where it should detect ECONNRESET/EPIPE/ETIMEDOUT errors and reject them with an appropriate error name. Example: TimeoutError for a socket timeout.
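A hedged sketch of that idea (an illustration only, not the actual change in node-http-handler): tag low-level socket errors with a name the SDK already treats as transient before rejecting the request promise.

```ts
import { ClientRequest } from "http";

// Hypothetical helper: socket error codes that should be surfaced as retryable.
const TRANSIENT_CODES = new Set(["ECONNRESET", "EPIPE", "ETIMEDOUT"]);

function wireTransientErrorHandling(
  req: ClientRequest,
  reject: (reason?: unknown) => void
): void {
  req.on("error", (err: NodeJS.ErrnoException) => {
    if (err.code && TRANSIENT_CODES.has(err.code)) {
      // "TimeoutError" is the example name mentioned above for socket timeouts,
      // which the retry decider classifies as transient.
      err.name = "TimeoutError";
    }
    reject(err);
  });
}
```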
This issue is fixed in #1693, and will be published in rc.7 on Thursday 11/19.
Update: the v1.0.0-rc.7 release (https://github.com/aws/aws-sdk-js-v3/releases/tag/v1.0.0-rc.7) is complete. Verified that this issue is fixed using the example code given in #1196 (comment).
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread. |
Describe the bug
We have a very basic Lambda function that reads a file from S3 when a new file is uploaded (we actually consume the Body stream too, left that out for brevity). The function is called intermittently, meaning that sometimes we get a new Lambda function (i.e. cold) and sometimes the Lambda container is reused. When the container is reused, we sometimes see an ECONNRESET exception such as this one.

I'm pretty confident that this is due to the keep-alive nature of the HTTPS connection. Lambda processes are frozen after they execute, and their host seems to terminate open sockets after ~10 minutes. The next time the S3 client tries to reuse the socket, the exception is thrown.
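For reference, a minimal sketch of the kind of handler described (the original snippet is not preserved in this copy of the thread; bucket/key handling and names are placeholders):

```ts
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { S3Event } from "aws-lambda";

// Created outside the handler, so a warm (reused) container also reuses the
// client and its kept-alive sockets; that is where the stale-socket ECONNRESET appears.
const s3 = new S3Client({});

export const handler = async (event: S3Event): Promise<void> => {
  const record = event.Records[0];
  const res = await s3.send(new GetObjectCommand({
    Bucket: record.s3.bucket.name,
    Key: record.s3.object.key,
  }));
  console.log("content type:", res.ContentType);
  // (the real function also consumes res.Body, omitted here for brevity)
};
```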
We are running into similar issues with connections to our Aurora database, which also terminate intermittently with the same error message (see brianc/node-postgres#2112). It's an error we can easily recover from if we try to reopen the socket, but aws-sdk-v3 seems to prefer to throw an error message instead.
Is the issue in the browser/Node.js?
Node.js 12.x on AWS Lambda