Restoring from snapshot should force generation of a new history uuid #26694
Conversation
Left one request for change and one question; otherwise looks good to me.
```diff
@@ -2080,10 +2080,24 @@ private DocumentMapperForType docMapper(String type) {

     private EngineConfig newEngineConfig(EngineConfig.OpenMode openMode) {
         Sort indexSort = indexSortSupplier.get();
         final boolean forceNewHistoryUUID;
         switch (shardRouting.recoverySource().getType()) {
             case EMPTY_STORE:
```
`EMPTY_STORE` should force a new history uuid (see also my comment here).
`EMPTY_STORE` is used for two situations:
- creating a shard of a fresh index: here, we trivially want to force a fresh UUID.
- force-allocating an empty shard: here, we are creating a fresh history, so we want to force a fresh UUID too.

Note that this leaves the case of allocating a stale primary, which should also force a fresh history. At the moment, though, we have no way to detect that based on the recovery source object. For this, if the recovery source is `EXISTING_STORE`, we can compare the on-disk allocation id (`ShardStateMetaData.FORMAT.load(...)`) with the expected allocation id in the shard routing object, and if they mismatch, force a new history id. Loading the `ShardStateMetaData` can happen in the `IndexShard` constructor.

I'm ok if you do this in a follow-up, but I still think `EMPTY_STORE` should be correctly set here to force a new history uuid.
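A minimal sketch of the check suggested above, with assumed names and parameters (in practice the on-disk allocation id would come from `ShardStateMetaData.FORMAT.load(...)` in the `IndexShard` constructor, and the expected one from the routing entry); this is not the actual `IndexShard` code:

```java
// Sketch only: the RecoverySource.Type values are real, but the method and its
// parameters are illustrative stand-ins for the loading described in the comment above.
static boolean shouldForceNewHistoryUUID(RecoverySource.Type recoveryType,
                                         String onDiskAllocationId,     // from ShardStateMetaData
                                         String expectedAllocationId) { // from ShardRouting.allocationId()
    switch (recoveryType) {
        case EMPTY_STORE:      // fresh index or force-allocated empty shard: fresh history
        case SNAPSHOT:         // restore rewinds the shard in time: fresh history
        case LOCAL_SHARDS:     // shard built from local copies: fresh history
            return true;
        case EXISTING_STORE:
            // force-allocated stale primary: the on-disk allocation id no longer matches the routing entry
            return onDiskAllocationId != null && onDiskAllocationId.equals(expectedAllocationId) == false;
        default:
            return false;      // e.g. PEER: keep whatever history the primary hands over
    }
}
```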
`EMPTY_STORE` will always result in a new history uuid, as we create a new index writer and it will not have any user data. I'm fine with setting `forceNewHistoryUUID` to true to be explicit (and later on morphing this into just `generateHistoryUUID` once all existing indices are known to have one and we can be explicit about the moments we generate new ones).

Re forcing stale allocations - good catch. As you say, that's a more complicated one indeed. I think we should do it as a follow-up and discuss possible approaches. I'm thinking about a new recovery source. I think this is an edge case we should be explicit about.
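For context, a minimal sketch of what generating and committing a history uuid amounts to here, assuming (per #26577) that it lives in the Lucene commit user data under a `history_uuid` key; the real engine code differs in structure and error handling, and the uuid generator below is a stand-in:

```java
import org.apache.lucene.index.IndexWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch only: the "history_uuid" key and the use of java.util.UUID (as a stand-in for
// Elasticsearch's UUIDs.randomBase64UUID()) are assumptions based on the discussion above.
static void maybeCommitHistoryUUID(IndexWriter indexWriter,
                                   Map<String, String> existingCommitData,
                                   boolean forceNewHistoryUUID) throws IOException {
    Map<String, String> userData = new HashMap<>(existingCommitData);
    if (forceNewHistoryUUID || userData.containsKey("history_uuid") == false) {
        userData.put("history_uuid", java.util.UUID.randomUUID().toString());
        indexWriter.setLiveCommitData(userData.entrySet()); // becomes part of the next commit
        indexWriter.commit();
    }
}
```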
```java
            assert openMode == EngineConfig.OpenMode.OPEN_INDEX_AND_TRANSLOG :
                "translog uuid didn't change but open mode is " + openMode;
        }
        if (needsCommit) {
            commitIndexWriter(indexWriter, translog, openMode == EngineConfig.OpenMode.OPEN_INDEX_CREATE_TRANSLOG
```
`commitIndexWriter` still has a check `if (historyUUID != null)`. Is that one obsolete now?
good catch. fixing.
LGTM
…#26694) Restoring a shard from snapshot throws the primary back in time, violating assumptions and bringing the validity of global checkpoints into question. To avoid problems, we should make sure that a shard that was restored will never be the source of an ops-based recovery to a shard that existed before the restore. To this end we have introduced the notion of `history_uuid` in #26577 and required that both source and target have the same history to allow ops-based recoveries. This PR makes sure that a shard gets a new uuid after restore. As suggested by @ywelsch, I derived the creation of a `history_uuid` from the `RecoverySource` of the shard. Store recovery will only generate a uuid if it doesn't already exist (we can make this stricter when we don't need to deal with 5.x indices). Peer recovery follows the same logic (note that this is different from the approach in #26557; I went this way as it means that shards always have a history uuid after being recovered on a 6.x node, and it also means that a rolling restart is enough for old indices to step over to the new seq no model). Local shards and snapshot force the generation of a new translog uuid. Relates #10708 Closes #26544
* master: (67 commits)
  Restoring from snapshot should force generation of a new history uuid (elastic#26694)
  test: Use a single primary shard so that the exception can caught in the same way
  Move pre-6.0 node checkpoint to SequenceNumbers
  Invalid JSON request body caused endless loop (elastic#26680)
  added comment
  fix line length violation
  Moved the check to fetch phase. This basically means that we throw a better error message instead of an AOBE and not adding more restrictions.
  inner hits: Do not allow inner hits that use _source and have a non nested object field as parent
  Separate Painless Whitelist Loading from the Painless Definition (elastic#26540)
  convert more admin requests to writeable (elastic#26566)
  Handle release of 5.6.1
  Allow `InputStreamStreamInput` array size validation where applicable (elastic#26692)
  Update global checkpoint with permit after recovery
  Filter pre-6.0 nodes for checkpoint invariants
  Skip bad request REST test on pre-6.0
  Reenable BWC tests after disabling for backport
  Add global checkpoint tracking on the primary
  [Test] Fix reference/cat/allocation/line_8 test failure
  [Docs] improved description for fs.total.available_in_bytes (elastic#26657)
  Fix discovery-file plugin to use custom config path
  ...
…roys translog recovery info (#26734) This is a bug introduced in #26694. The issue comes from the attempt to share code that commits the new history uuid and/or a new translog uuid. This goes wrong for an existing 5.6 index that is recovered from store:
1) A new history uuid is generated, as it doesn't exist in the index.
2) The translog is opened and its uuid doesn't change.
3) Committing the new history uuid used the standard commitIndexWriter method.
4) The latter asks the translog for the oldest file generation that is required to recover from the local checkpoint + 1.
5) The local checkpoint on old indices is -1 (until something is indexed), so the translog is asked to recover from position 0. That excludes any operations in the translog that do not have a seq no assigned, causing the FullClusterRestart bwc tests to fail.

To bypass this, this commit moves away from the attempt to share the committing code between a new translog uuid and a new history uuid. Instead we do what we did before and open the translog with a potential commit. Afterwards we commit the history uuid if needed. This comes at the expense of opening up the method to commit an index writer in the engine.
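A minimal sketch of the fixed ordering described above, using hypothetical helper names rather than the actual engine internals: the translog is opened and recovered exactly as before, and only afterwards, if needed, is the history uuid committed in a dedicated step that does not consult the local checkpoint.

```java
// Sketch only: openTranslog(...), loadHistoryUUIDFromCommit(...) and
// commitHistoryUUIDSeparately(...) are hypothetical stand-ins; the point is the ordering.
void recoverTranslogThenMaybeCommitHistoryUUID() throws IOException {
    Translog translog = openTranslog(engineConfig);              // unchanged pre-#26694 behaviour,
                                                                 // may itself commit a new translog uuid
    if (loadHistoryUUIDFromCommit(indexWriter) == null) {        // e.g. a 5.6 index recovered from store
        // generate and commit the history uuid without touching translog recovery information
        String historyUUID = java.util.UUID.randomUUID().toString(); // stand-in for UUIDs.randomBase64UUID()
        commitHistoryUUIDSeparately(indexWriter, translog.getTranslogUUID(), historyUUID);
    }
}
```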
Restoring a shard from snapshot throws the primary back in time, violating assumptions and bringing the validity of global checkpoints into question. To avoid problems, we should make sure that a shard that was restored will never be the source of an ops-based recovery to a shard that existed before the restore. To this end we have introduced the notion of `history_uuid` in #26577 and required that both source and target have the same history to allow ops-based recoveries. This PR makes sure that a shard gets a new uuid after restore.

As suggested by @ywelsch, I derived the creation of a `history_uuid` from the `RecoverySource` of the shard. Store recovery will only generate a uuid if it doesn't already exist (we can make this stricter when we don't need to deal with 5.x indices). Peer recovery follows the same logic (note that this is different from the approach in #26557; I went this way as it means that shards always have a history uuid after being recovered on a 6.x node, and it also means that a rolling restart is enough for old indices to step over to the new seq no model). Local shards and snapshot force the generation of a new translog uuid.

Relates #10708
Closes #26544
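A minimal sketch of the invariant this PR relies on, with a hypothetical helper name (the real peer recovery decision in Elasticsearch is more involved): an operations-based recovery is only allowed when source and target share the same history uuid, so a restored shard, having a fresh uuid, always falls back to a file-based recovery for copies that predate the restore.

```java
// Hypothetical helper, not actual Elasticsearch code: this is only the history-uuid part
// of the real ops-based vs. file-based recovery decision.
static boolean opsBasedRecoveryAllowed(String sourceHistoryUUID, String targetHistoryUUID) {
    // identical, non-null history uuids mean both copies describe the same operation history
    return sourceHistoryUUID != null && sourceHistoryUUID.equals(targetHistoryUUID);
}
```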