-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Remote Store] Download segments from remote segment store in failover flow #5579
[Remote Store] Download segments from remote segment store in failover flow #5579
Conversation
This change is built on top of #5253. Once refresh level durability PR merges, we need to rebase and open the PR for review. |
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
3f57b9b
to
5fb9bce
Compare
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
0ed52cf
to
4e87f36
Compare
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #5579 +/- ##
============================================
+ Coverage 70.96% 71.06% +0.09%
- Complexity 58554 58662 +108
============================================
Files 4760 4760
Lines 279515 279566 +51
Branches 40348 40357 +9
============================================
+ Hits 198363 198670 +307
+ Misses 64965 64717 -248
+ Partials 16187 16179 -8
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
e6a5e13
to
cb5f829
Compare
Gradle Check (Jenkins) Run Completed with:
|
@@ -623,6 +634,7 @@ public void updateShardState( | |||
if (indexSettings.isSegRepEnabled()) { | |||
// this Shard's engine was read only, we need to update its engine before restoring local history from xlog. | |||
assert newRouting.primary() && currentRouting.primary() == false; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit : remove empty lines. Here and elsewhere.
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
65adf0f
to
08e72f6
Compare
Gradle Check (Jenkins) Run Completed with:
|
logger.info("Downloaded segments: {}", downloadedSegments); | ||
logger.info("Skipped download for segments: {}", skippedSegments); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit combine both log lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we gain by combining these log lines? Won't it impact the ability to debug?
* @param override flag to override local segment files with those in remote store | ||
* @throws IOException if exception occurs while reading segments from remote store | ||
*/ | ||
public void syncSegmentsFromRemoteSegmentStore(boolean override) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Rename the param to be more specific
if (checksum == CodecUtil.retrieveChecksum(indexInput)) { | ||
return true; | ||
} else { | ||
logger.warn("Checksum mismatch between local and remote segment file: {}, will override local file", file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could checksum mismatch be a sign of a corruption that should be flagged, which would otherwise get unnoticed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mismatch could be due to two things:
- The node where we are downloading files, already has a segment file with the same name but created independently of the primary. This could happen if there are two primaries creating segments.
- As you pointed out, due to corruption.
Here, we are handling # 1. We are making sure that while downloading segments, if the segment with similar name exists, we don't just assume that both are same file but compare checksums as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will create a tracking issue to add checksum to metadata file and compare checksum while reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already being tracked here: #4605
logger.warn("Checksum mismatch between local and remote segment file: {}, will override local file", file); | ||
} | ||
} catch (IOException e) { | ||
logger.debug("Exception while reading checksum of file: {}, this can happen if file does not exist", file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we have the right handling here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is safe to just log the exception as we will go ahead with downloading the segment file from remote store.
Or are you talking about the log level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the log level to warn and also adding details of possible corruption in the log message
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Gradle Check (Jenkins) Run Completed with:
|
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Gradle Check (Jenkins) Run Completed with:
|
Description
Issues Resolved
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.