Dataloss issue with this lambda redshift loader #230

Open
rejimdt opened this issue Jun 9, 2021 · 7 comments

Comments

@rejimdt
rejimdt commented Jun 9, 2021

Hi,
We ran into a data loss issue with this setup and had to switch to a different approach. We are looking forward to getting the issue fixed.

We completed the full setup, and it worked fine with relatively small data volumes. We used the reprocessBatch approach to re-run the COPY command for failed files. It worked well in our lower environment. In production, however, with a large volume of data, we lost 850 million of 23 billion records. We made sure that the failed files were reprocessed, and we did not see any errors.

Thanks

@IanMeyers
Contributor

It is highly likely that your Lambda functions were timing out during a long-running COPY operation. Can you please confirm what timeout you were setting? The recommended timeout is 15 minutes. If the COPY command takes longer than 15 minutes, then you need to either load more frequently or parallelise data loads at the table level.
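
To make the timeout reasoning above concrete, here is a minimal sketch of how one might estimate the largest batch that can finish inside the Lambda timeout. It assumes COPY time grows roughly linearly with the number of files in a batch, which may not hold on a real cluster, so the per-file time should be measured rather than guessed. The function names, the 80% safety margin, and the example numbers are all hypothetical and are not part of the loader.

```python
import math

# 15 minutes, the timeout recommended in this thread.
LAMBDA_TIMEOUT_SECONDS = 900


def max_safe_batch_size(seconds_per_file, timeout_seconds=LAMBDA_TIMEOUT_SECONDS,
                        safety_margin=0.8):
    """Estimate how many files fit in one COPY before the function times out.

    Assumes (hypothetically) that load time scales linearly with file count.
    safety_margin reserves part of the timeout for everything that is not
    the COPY itself (connection setup, batch bookkeeping, and so on).
    """
    if seconds_per_file <= 0:
        raise ValueError("seconds_per_file must be positive")
    if not 0 < safety_margin <= 1:
        raise ValueError("safety_margin must be in (0, 1]")
    budget = timeout_seconds * safety_margin
    return math.floor(budget / seconds_per_file)


def batch_fits(batch_size, seconds_per_file, **kwargs):
    """Return True if the estimated COPY duration stays inside the budget."""
    return batch_size <= max_safe_batch_size(seconds_per_file, **kwargs)


if __name__ == "__main__":
    # Hypothetical measurement: one file takes 200 seconds to load.
    print(max_safe_batch_size(200))
    print(batch_fits(5, 200))
```

If the estimate comes out as 0, a single file cannot be loaded within the timeout under these assumptions, which points towards smaller files or splitting loads across tables rather than a smaller batch size.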

@rejimdt
Author

rejimdt commented Jun 9, 2021 via email

@IanMeyers
Contributor

And what is the batch load size, what is the timeout, and how many files are submitted during the timeout interval, please?

@rejimdt
Author

rejimdt commented Jun 9, 2021 via email

@IanMeyers
Contributor

And how long does the cluster take to load a single file?

@rejimdt
Author

rejimdt commented Jun 10, 2021 via email

@IanMeyers
Contributor

OK - so in theory you can go to a batch size of ~3, but it would be good to test the runtime of that.

Can you please access the Lambda logs for a failed load and paste them here?
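
When going through exported Lambda logs for a failed load, one thing worth looking for is the runtime's timeout message. The sketch below scans log lines for the "Task timed out after N seconds" text that Lambda writes when a function hits its timeout. It assumes the logs have already been exported as plain text lines. The function name is hypothetical, and the sample lines in the usage section are made up for illustration, not taken from this issue.

```python
import re

# Lambda reports a timeout with a line containing this phrase.
TIMEOUT_PATTERN = re.compile(r"Task timed out after (\d+(?:\.\d+)?) seconds")


def find_timeouts(log_lines):
    """Return (seconds, line) pairs for every log line reporting a timeout."""
    hits = []
    for line in log_lines:
        match = TIMEOUT_PATTERN.search(line)
        if match:
            hits.append((float(match.group(1)), line.strip()))
    return hits


if __name__ == "__main__":
    # Made-up sample lines, only to show the function's behaviour.
    sample = [
        "START RequestId: example-1 Version: $LATEST",
        "2021-06-09T10:00:00.000Z example-1 Task timed out after 900.00 seconds",
        "END RequestId: example-1",
    ]
    for seconds, line in find_timeouts(sample):
        print(seconds, line)
```

Any hit here would support the timeout explanation. No hits would suggest looking elsewhere, for example at whether every file in a batch was actually included in the COPY.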
