Initialize sequence numbers on a shrunken index #25321
Conversation
Bringing together shards in a shrunken index means that we need to address the start of history for the shrunken index. The problem here is that sequence numbers before the maximum of the maximum sequence numbers on the source shards can collide in the target shards in the shrunken index. To address this, we set the maximum sequence number and the local checkpoint on the target shards to this maximum of the maximum sequence numbers. This enables correct document-level semantics for documents indexed before the shrink, and history on the shrunken index will effectively start from here.
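A minimal sketch of the idea, using plain Lucene `IndexWriter` APIs. The class, the helper method, and the key string values are assumptions for illustration; they mirror, but are not, the actual Elasticsearch change shown in the diffs below.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.index.IndexWriter;

final class ShrinkSeqNoInit {
    // Commit-data keys assumed to mirror SequenceNumbers.MAX_SEQ_NO and
    // SequenceNumbers.LOCAL_CHECKPOINT_KEY in Elasticsearch.
    static final String MAX_SEQ_NO = "max_seq_no";
    static final String LOCAL_CHECKPOINT_KEY = "local_checkpoint";

    static void stampStartOfHistory(final IndexWriter writer, final List<Long> sourceMaxSeqNos) throws IOException {
        // The new start of history: the maximum of the per-source-shard maximum
        // sequence numbers; -1 if no operations were ever performed.
        final long maxSeqNo = sourceMaxSeqNos.stream().mapToLong(Long::longValue).max().orElse(-1L);
        final Map<String, String> commitData = new HashMap<>(2);
        commitData.put(MAX_SEQ_NO, Long.toString(maxSeqNo));
        commitData.put(LOCAL_CHECKPOINT_KEY, Long.toString(maxSeqNo));
        // Both markers get the same value, so the local checkpoint on the
        // shrunken shard is already caught up to the max seq no.
        writer.setLiveCommitData(commitData.entrySet());
        writer.commit();
    }
}
```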
Thx @jasontedor
```diff
-                .collect(Collectors.toList()).toArray(new Directory[shards.size()]));
+        final Directory[] sources =
+            shards.stream().map(LocalShardSnapshot::getSnapshotDirectory).collect(Collectors.toList()).toArray(new Directory[0]);
```
Nit: there is `Stream#toArray`, i.e. `stream.toArray(Directory[]::new)`.
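For reference, the suggested form collapses the collect-then-convert chain into a single call (a sketch of the nit, not the committed code):

```java
// Stream#toArray with an array generator skips the intermediate List.
final Directory[] sources =
        shards.stream().map(LocalShardSnapshot::getSnapshotDirectory).toArray(Directory[]::new);
```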
```java
writer.setLiveCommitData(() -> {
    final HashMap<String, String> liveCommitData = new HashMap<>(2);
    liveCommitData.put(SequenceNumbers.MAX_SEQ_NO, Long.toString(maxSeqNo));
    liveCommitData.put(SequenceNumbers.LOCAL_CHECKPOINT_KEY, Long.toString(maxSeqNo));
    // The lambda implements Iterable<Map.Entry<String, String>>, so it must
    // return an iterator over the commit data.
    return liveCommitData.entrySet().iterator();
});
```
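For context, these values land in the commit's user data, so they can be read back when the shard is later opened. A minimal sketch using plain Lucene APIs (`DirectoryReader.listCommits` and `IndexCommit#getUserData`); the `directory` variable is an assumption standing in for the target shard's `Directory`:

```java
// Read the stamped seq-no markers back from the latest Lucene commit;
// listCommits returns commits ordered oldest to newest.
final List<IndexCommit> commits = DirectoryReader.listCommits(directory);
final Map<String, String> userData = commits.get(commits.size() - 1).getUserData();
final long maxSeqNo = Long.parseLong(userData.get(SequenceNumbers.MAX_SEQ_NO));
final long localCheckpoint = Long.parseLong(userData.get(SequenceNumbers.LOCAL_CHECKPOINT_KEY));
```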
Is the plan to do MAX_UNSAFE_AUTO_ID_TIMESTAMP_COMMIT_ID as a follow-up?
Yes, I edited the plan on #10708 to separate these out into separate line items; I do not like mixing things. I do forgive you for not seeing this. 😛
```diff
@@ -233,7 +228,8 @@ public void testCreateShrinkIndex() {
             .put("number_of_shards", randomIntBetween(2, 7))
             .put("index.version.created", version)
         ).get();
-        for (int i = 0; i < 20; i++) {
+        final int docs = randomIntBetween(1, 128);
```
do we want to test with 0 docs too?
Sure!
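Presumably the follow-up is just widening the lower bound of the random range (a sketch; the actual commit is not shown here):

```java
// Include the empty-index case: zero docs exercises a shrink where no
// operations were performed on the source shards.
final int docs = randomIntBetween(0, 128);
```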
Relates #10708