Ignore metadata of deleted indices at start #48918
Today in 6.x it is possible to add an index tombstone to the graveyard without deleting the corresponding index metadata, because the deletion is slightly deferred. If you shut down the node and upgrade to 7.x when in this state then the node will fail to apply any cluster states, reporting java.lang.IllegalStateException: Cannot delete index [...], it is still part of the cluster state. This commit addresses this situation by skipping over any index metadata with a corresponding tombstone, allowing this metadata to be cleaned up by the 7.x node.
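The skipping behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class `TombstoneFilterSketch`, the method `skipTombstoned`, and the plain-collection representation (index UUID mapped to index name, graveyard as a set of UUIDs) are hypothetical stand-ins chosen for illustration, and the real Elasticsearch types are deliberately not used.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class TombstoneFilterSketch {

    /**
     * Returns the on-disk index metadata entries that have no tombstone.
     * Assumption for illustration: keys are index UUIDs, values are index names,
     * and graveyardUuids holds the UUIDs of indices recorded as deleted.
     */
    static Map<String, String> skipTombstoned(Map<String, String> onDiskIndices,
                                              Set<String> graveyardUuids) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : onDiskIndices.entrySet()) {
            if (graveyardUuids.contains(entry.getKey())) {
                // A tombstone exists, so the deletion was recorded but the
                // metadata was not yet removed. Skip it instead of loading it.
                continue;
            }
            kept.put(entry.getKey(), entry.getValue());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> onDisk = new LinkedHashMap<>();
        onDisk.put("uuid-1", "logs");
        onDisk.put("uuid-2", "metrics"); // deleted in 6.x, metadata still present

        Set<String> graveyard = Collections.singleton("uuid-2");

        System.out.println(skipTombstoned(onDisk, graveyard)); // prints {uuid-1=logs}
    }
}
```

In this sketch the skipped entry is simply left out of the loaded state, which mirrors the idea that the leftover metadata can then be cleaned up later rather than conflicting with its tombstone.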
Pinging @elastic/es-distributed (:Distributed/Cluster Coordination)
This needs more work because it doesn't address the case where you're in this state in a rolling upgrade.
Discussed this and decided it's ok for a rolling upgrade to fall back on a full cluster restart if it happens to be in this state, and this PR will allow the full cluster restart to proceed.
Looks good. I've left one request to rename a function.
final MetaData metaData = internalCluster().getInstance(ClusterService.class).state().metaData();
final Path[] paths = internalCluster().getInstance(NodeEnvironment.class).nodeDataPaths();
writeBrokenMeta(metaStateService -> {
I know this is not your change, but the name "writeBrokenMeta" looks wrong for two reasons: in this test we write well-formed metadata, and this method also performs a full cluster restart, which I think should be reflected in the method name.
A more substantial change to this method (including fixing its name) is incoming in https://github.com/elastic/elasticsearch/pull/48733/files#diff-a53ee618ca95b1bde55d7f5508a03d6aR511.
However, the metadata written here is well-formed but still broken: it contains a tombstone for an index that is not properly deleted.
LGTM
I got this error while upgrading from 6.7 to 7.0.1. For the benefit of other users/readers: I was able to restart my cluster by manually removing all the state with the latest Lucene/Luke, after seeing the "Cannot delete index [...], it is still part of the cluster state." message on each node start.
For the sake of other users/readers, this advice is extremely dangerous and we do not recommend following it. Editing the contents of the data path using a tool like Luke can result in arbitrary and silent data loss. @dpeddi the actual solution is not to upgrade to 7.0.1, this version is already long past EOL. The current recommendation is to upgrade to the latest 6.8 and then the latest 7.x (7.10.2 at time of writing).
Today in 6.x it is possible to add an index tombstone to the graveyard without
deleting the corresponding index metadata, because the deletion is slightly
deferred. If you shut down the node and upgrade to 7.x when in this state then
the node will fail to apply any cluster states, reporting
java.lang.IllegalStateException: Cannot delete index [...], it is still part
of the cluster state.
This commit addresses this situation by skipping over any index metadata with a
corresponding tombstone, allowing this metadata to be cleaned up by the 7.x
node.