[Transform] do not fail checkpoint creation due to global checkpoint mismatch #48423
Pinging @elastic/ml-core (:ml/Transform)
minor simplification suggestion
// it's possible that replica shards report a different/higher global checkpoints
// This is by design and not a problem, take the max() for this case
if (checkpoints.get(shard.getShardRouting().getId()) < globalCheckpoint) {
    checkpoints.put(shard.getShardRouting().getId(), globalCheckpoint);
This whole if clause could then be changed to something similar to:
checkpoints.compute(shard.getShardRouting().getId(), (shardId, cp) -> (cp == null) ? globalCheckpoint : Math.max(cp, globalCheckpoint));
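To illustrate the suggestion above, here is a minimal, self-contained sketch of the `Map.compute` pattern. It is not the actual Transform code: the class `CheckpointMaxSketch`, the helper `record`, and the use of plain `Integer` shard ids and `Long` checkpoints are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

public class CheckpointMaxSketch {

    // Hypothetical helper (not part of Elasticsearch): keeps the highest
    // global checkpoint reported so far for the given shard id.
    public static void record(Map<Integer, Long> checkpoints, int shardId, long globalCheckpoint) {
        checkpoints.compute(
            shardId,
            // cp is null the first time a shard id is seen; otherwise take the max
            (id, cp) -> (cp == null) ? globalCheckpoint : Math.max(cp, globalCheckpoint)
        );
    }

    public static void main(String[] args) {
        Map<Integer, Long> checkpoints = new TreeMap<>();
        record(checkpoints, 0, 41L); // first copy of shard 0
        record(checkpoints, 0, 42L); // another copy reports a higher value: kept
        record(checkpoints, 1, 7L);  // first copy of shard 1
        record(checkpoints, 1, 5L);  // lower value: ignored
        System.out.println(checkpoints); // prints {0=42, 1=7}
    }
}
```

With boxed `Long` values, `checkpoints.merge(shardId, globalCheckpoint, Long::max)` would behave the same way; which form reads better is a matter of taste.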
ae7d3ad to 1a2cb53
run elasticsearch-ci/1
Take the max if global checkpoints mismatch instead of throwing an exception. It turned out that
global checkpoints can mismatch by design.
Fixes #48379
Severity: in rare cases checkpoint creation can fail due to this issue; however, checkpoint creation is retried