This repository has been archived by the owner on Sep 6, 2023. It is now read-only.
Error during export should save last exported timestamp #109
Two issues are addressed:
Exporting large amounts of data to the lake caused a query timeout when sorting the records by row version. Sorting matters because it ensures that a subsequent export resumes from the records that were not exported during the last run. The timeout was caused by sorting over a large number of records, so a new flag has been introduced that allows unsorted records to be uploaded to the lake. The flag is intended only as a temporary measure and should be disabled once the data has been uploaded to the lake.
An error occurring late in the export process forced the system to start from the first record when the export was invoked again. The system is now more robust: it saves the last exported timestamp even when an error occurs. This lets subsequent exports "catch up" from the point the last export reached in the lake.
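The two changes can be illustrated with a minimal Python sketch. It is not the code from this pull request: `Record`, `export_records`, `upload`, `save_checkpoint`, and `skip_sorting` are hypothetical names chosen for illustration, and the row version is assumed to be an increasing integer. The sketch also assumes that a checkpoint written after a failure is only a safe resume point when the records were sorted, so in unsorted mode it writes the checkpoint only after a complete run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Record:
    row_version: int  # assumed: increases with every change to the row
    payload: str


def export_records(
    records: Iterable[Record],
    upload: Callable[[Record], None],
    save_checkpoint: Callable[[int], None],
    skip_sorting: bool = False,  # hypothetical stand-in for the new flag
) -> Optional[int]:
    """Upload records and persist the highest exported row version."""
    batch = list(records)
    if not skip_sorting:
        # Sorted order makes the checkpoint a safe place to resume from.
        batch.sort(key=lambda r: r.row_version)

    last_exported: Optional[int] = None
    completed = False
    try:
        for record in batch:
            upload(record)
            if last_exported is None or record.row_version > last_exported:
                last_exported = record.row_version
        completed = True
    finally:
        # Runs on success and on error, so a late failure keeps its progress.
        # In unsorted mode a partial run is not checkpointed (an assumption
        # of this sketch), because earlier row versions may still be missing.
        if last_exported is not None and (completed or not skip_sorting):
            save_checkpoint(last_exported)
    return last_exported
```

With sorting enabled, an upload error on the third of three records still records the second record's row version before the exception propagates, so the next run can resume from there instead of from the first record.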