Dataloss issue with this lambda redshift loader #230
Comments
It is highly likely that your lambda functions were timing out during a long-running COPY operation. Can you please confirm what timeout you were setting? Recommended timeout is 15 minutes. If the COPY command will take longer than 15 minutes, then you either need to load more frequently, or parallelise data loads at the table level.
We had set the timeout to 15 minutes. There were up to 80 files (~800 MB each, compressed Parquet) created at a time, loaded using Glue batches as input. We experienced the issue mainly during this period.
And what is the batch load size, and timeout, and how many files are submitted during the timeout interval please?
batchTimeoutSecs = 60
batchSize = 1
dataFormat = PARQUET
batchSizeBytes = 10485760
The Glue job processes hourly data, every hour. Files: at most 50 files of ~800 MB each.
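To see how these settings interact with the reported file sizes, here is a minimal sketch (a plain arithmetic check using only the numbers quoted in this thread, not actual loader code):

```python
# Values reported in this thread.
batch_size_bytes = 10485760          # configured batchSizeBytes: 10 MiB
file_size_bytes = 800 * 1024 ** 2    # each Parquet file is ~800 MB
files_per_hour = 50                  # up to 50 files per hourly Glue run

# A single input file is roughly 80x larger than the byte threshold,
# so every file immediately fills its own batch and triggers a COPY.
print(file_size_bytes / batch_size_bytes)  # -> 80.0
```

With batchSize=1 and a 10 MiB byte threshold, each 800 MB file becomes its own batch, so up to 50 concurrent COPY operations can be triggered per hour.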
And how long does the cluster take to load a single file?
On average, a single file takes 4+ minutes to load, and some loads sit waiting because many files are pushed at the same time.
OK - so in theory you can go to a batch size of ~3, but it would be good to test the runtime of that. Can you please access the Lambda logs for a failed load and paste them here?
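The "batch size of ~3" estimate above follows from fitting sequential COPY operations inside one Lambda invocation. A quick sketch of that reasoning, using the timings reported earlier in the thread (assumed averages, not measured loader behaviour):

```python
# Numbers quoted in this thread.
lambda_timeout_secs = 15 * 60        # recommended Lambda timeout: 15 minutes
avg_copy_secs_per_file = 4 * 60      # reported average COPY time: 4+ minutes per file

# How many sequential ~4-minute COPYs fit in one 15-minute invocation.
max_batch = lambda_timeout_secs // avg_copy_secs_per_file
print(max_batch)  # -> 3
```

Since the 4-minute figure is a lower bound ("4 minutes +") and loads can queue behind one another, the real safe batch size may be smaller, which is why testing the runtime is advised.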
Hi,
We faced a data loss issue with this setup and had to move to a different approach. Looking forward to getting the issue fixed.
We completed all the setup and it worked fine with relatively little data. We used the reprocessBatch approach to fire the COPY command for failed files. It worked well in the lower environment, but in production, with the large volume of data, we lost 850 million records out of 23 billion. We had made sure the failed files were reprocessed, and we did not get any errors either.
Thanks