Create retention leases file during recovery #39359
Conversation
Today we recover shard history retention leases from disk whenever opening the engine. However in many cases this is inappropriate: we might be restoring from a snapshot (if the target index already exists then there may be leases on disk) or force-allocating a stale primary, and in neither case does it make sense to restore the retention leases. Moreover when recovering a replica during peer recovery there is no need to restore the retention leases since they will be overwritten by the primary soon after. In fact the only time we want to recover the shard history retention leases from disk is if the recovery source is `RecoverySource.ExistingStoreRecoverySource.INSTANCE`. This change does that.
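A minimal, self-contained sketch of the rule this description lays out, assuming a simplified stand-in enum rather than the real `RecoverySource` class hierarchy: only an existing-store recovery trusts the retention leases persisted on disk.

```java
// Illustrative sketch only: this enum and class are hypothetical stand-ins,
// not Elasticsearch's actual RecoverySource API.
public class RetentionLeaseRecoverySketch {

    enum RecoverySource { EXISTING_STORE, SNAPSHOT, PEER, FORCED_STALE_PRIMARY }

    // Only an existing-store recovery may restore the on-disk retention leases;
    // snapshot restores and stale-primary allocations must not, and a peer
    // recovery target gets its leases from the primary shortly afterwards.
    static boolean shouldRecoverLeasesFromDisk(RecoverySource source) {
        return source == RecoverySource.EXISTING_STORE;
    }

    public static void main(String[] args) {
        for (RecoverySource source : RecoverySource.values()) {
            System.out.println(source + " -> " + shouldRecoverLeasesFromDisk(source));
        }
    }
}
```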
Pinging @elastic/es-distributed
@elasticmachine please run elasticsearch-ci/docbldesx
```java
@@ -1433,7 +1433,12 @@ private void innerOpenEngineAndTranslog() throws IOException {
    final String translogUUID = store.readLastCommittedSegmentsInfo().getUserData().get(Translog.TRANSLOG_UUID_KEY);
    final long globalCheckpoint = Translog.readGlobalCheckpoint(translogConfig.getTranslogPath(), translogUUID);
    replicationTracker.updateGlobalCheckpointOnReplica(globalCheckpoint, "read from translog checkpoint");
    updateRetentionLeasesOnReplica(loadRetentionLeases());
    // only recover retention leases if recovering from an existing store _and_ not bootstrapping a new history UUID.
    if (recoveryState.getRecoverySource() == RecoverySource.ExistingStoreRecoverySource.INSTANCE) {
```
This logic should preferably go into `StoreRecovery`, where we also do the initialization of the Lucene commit and translog files. Whenever we call `createEmptyTranslog` we should also delete the retention leases file.
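A minimal sketch of that pairing, with hypothetical paths and method names (this is not the actual `StoreRecovery` code): bootstrapping an empty translog and discarding stale retention lease state happen in one step, so no caller can do the former without the latter.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BootstrapSketch {

    // Bootstrap an empty translog and drop any stale retention lease state together.
    static void bootstrapEmptyShardHistory(Path translogDir, Path retentionLeasesFile) throws IOException {
        Files.createDirectories(translogDir);      // stand-in for createEmptyTranslog
        Files.deleteIfExists(retentionLeasesFile); // stale leases must not survive the reset
    }

    public static void main(String[] args) throws IOException {
        Path shardDir = Files.createTempDirectory("shard");
        Path leases = shardDir.resolve("retention-leases-1.st");
        Files.createFile(leases); // pretend stale state exists from a previous shard copy
        bootstrapEmptyShardHistory(shardDir.resolve("translog"), leases);
        System.out.println("stale leases file still exists: " + Files.exists(leases)); // false
    }
}
```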
+1
Ok. By correlating this with `createEmptyTranslog`, do you mean that we should consider a missing retention leases state file as an error when loading it? Today we treat this as if it were empty.
Yeah, I think we should not be lenient if we prepare the lease file before opening an Engine.
Ok, we can't be strict here yet because of BWC: indices created in 7.x do not yet have a retention lease file, and indices created before 6.7.0 never will. I left the lenience in and will remove it in a follow-up after backporting. I included an assertion to ensure we follow up.
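As a rough illustration of the lenient load plus follow-up assertion (all types here, and the way the BWC condition is passed in, are hypothetical, not the actual Elasticsearch API):

```java
import java.util.Collections;
import java.util.List;

public class LenientLeaseLoadSketch {

    record RetentionLeases(List<String> leases) {}

    // Hypothetical stand-in for the on-disk state loader: may return null when
    // the index predates the retention leases file.
    static RetentionLeases loadLatestState() {
        return null; // simulate an old index with no leases file on disk
    }

    static RetentionLeases loadRetentionLeases(boolean indexCreatedBeforeLeasesFile) {
        final RetentionLeases onDisk = loadLatestState();
        // Lenience is only acceptable for old indices; assert so the follow-up
        // to make this strict is not forgotten.
        assert onDisk != null || indexCreatedBeforeLeasesFile : "retention leases file missing";
        return onDisk != null ? onDisk : new RetentionLeases(Collections.emptyList());
    }

    public static void main(String[] args) {
        System.out.println(loadRetentionLeases(true).leases()); // prints []
    }
}
```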
@elasticmachine run elasticsearch-ci/bwc
```java
@@ -414,6 +414,11 @@ public void cleanFiles(int totalTranslogOps, Store.MetadataSnapshot sourceMetaDa
        indexShard.shardPath().resolveTranslog(), SequenceNumbers.UNASSIGNED_SEQ_NO, shardId,
        indexShard.getPendingPrimaryTerm());
    store.associateIndexWithNewTranslog(translogUUID);
    assert indexShard.getRetentionLeases().leases().isEmpty() : indexShard.getRetentionLeases(); // not loaded yet
```
Peer recoveries can be retried if they fail midway through, so I wonder if this assertion can be violated.
@jasontedor @dnhatn just checking this PR hasn't been lost - it'd be good to hear your thoughts.
Thanks for the ping @DaveCTurner. I’ll look in my morning. Would you mark this PR as a blocker?
LGTM.
…ate file. Since elastic#39359, retention leases should always exist and can never be null.
`MetadataStateFormat.FORMAT.loadLatestState` can actually return null when the state directory hasn't been initialized yet, so we have to keep the null check when loading retention leases during the initialization of the engine. See #39359
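A hedged illustration of that behavior, modeling a generic load-latest-state call with plain file I/O rather than `MetadataStateFormat` itself: the loader returns null when no state was ever written, so the caller must null-coalesce to an empty value.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class LoadLatestStateSketch {

    // Returns the state file's contents, or null if the state directory was
    // never initialized (no state file has ever been written).
    static String loadLatestState(Path stateDir) throws Exception {
        Path stateFile = stateDir.resolve("retention-leases-1.st");
        return Files.exists(stateFile) ? Files.readString(stateFile) : null;
    }

    public static void main(String[] args) throws Exception {
        Path freshDir = Files.createTempDirectory("state");
        String state = loadLatestState(freshDir);
        // The null check that has to be kept when opening the engine:
        String leases = state != null ? state : "";
        System.out.println("loaded leases: '" + leases + "'");
    }
}
```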
Today we load the shard history retention leases from disk whenever opening the
engine, and treat a missing file as an empty set of leases. However in some
cases this is inappropriate: we might be restoring from a snapshot (if the
target index already exists then there may be leases on disk) or
force-allocating a stale primary, and in neither case does it make sense to
restore the retention leases from disk.
With this change we write an empty retention leases file during recovery,
except for the following cases:
- During peer recovery the on-disk leases may be accurate and could be needed
  if the recovery target is made into a primary.
- During recovery from an existing store, as long as we are not
  force-allocating a stale primary.
Relates #37165