-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29190 Allow disabling regions in FAILED_OPEN state #6797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| am.getRegionStates().removeFromFailedOpen(regionNode.getRegionInfo()); | ||
| future = am.getRegionStateStore().updateRegionLocation(regionNode); | ||
| FutureUtils.addListener(future, (r, e) -> { | ||
| if (e != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failed persisting will lead to master crash I believe, so we do not need to deal with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, thanks for the clarification. Actually I was unable to test this part, but I was unsure if it's needed so just I added it. Let me remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // FAILED_OPEN doesn't need further transition, immediately mark the region as closed | ||
| AssignmentManager am = env.getAssignmentManager(); | ||
| am.getRegionStates().removeFromFailedOpen(regionNode.getRegionInfo()); | ||
| future = am.getRegionStateStore().updateRegionLocation(regionNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to set the state to CLOSED when the state is FAILED_OPEN? And if the state is CLOSED, do we still need to call udpateRegionLocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback. Before I answer the questions, please take a look at dd8716f. I added a few more steps so that the intention of this change is clearer.
We do not need to set the state to CLOSED when the state is FAILED_OPEN?
So the question is: "Is regionNode.setState(State.CLOSED, State.FAILED_OPEN) really necessary? Can we just check regionNode.isInState(State.FAILED_OPEN) here?" Am I right?
If we don't change the state,
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
index 6b5aaf6975..215e1245ed 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
@@ -404,11 +404,13 @@ public class TransitRegionStateProcedure
if (regionNode.isInState(STATES_EXPECTED_ON_CLOSING)) {
// This is the normal case
future = env.getAssignmentManager().regionClosing(regionNode);
- } else if (regionNode.setState(State.CLOSED, State.FAILED_OPEN)) {
- // FAILED_OPEN doesn't need further transition, immediately mark the region as closed
+ } else if (regionNode.isInState(State.FAILED_OPEN)) {
+ // Remove the region from RIT list to suppress periodic "RITs over threshold" messages
AssignmentManager am = env.getAssignmentManager();
am.getRegionStates().removeFromFailedOpen(regionNode.getRegionInfo());
- future = am.getRegionStateStore().updateRegionLocation(regionNode);
+
+ // FAILED_OPEN doesn't need further transition
+ future = CompletableFuture.allOf();
}
if (future != null) {
ProcedureFutureUtil.suspendIfNecessary(this, this::setFuture, future, env,we will run into an assertion error in confirmClosed:
Lines 423 to 449 in cab6eb4
| private Flow confirmClosed(MasterProcedureEnv env, RegionStateNode regionNode) | |
| throws IOException { | |
| if (regionNode.isInState(State.CLOSED)) { | |
| retryCounter = null; | |
| if (lastState == RegionStateTransitionState.REGION_STATE_TRANSITION_CONFIRM_CLOSED) { | |
| // we are the last state, finish | |
| regionNode.unsetProcedure(this); | |
| return Flow.NO_MORE_STATE; | |
| } | |
| // This means we need to open the region again, should be a move or reopen | |
| setNextState(RegionStateTransitionState.REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE); | |
| return Flow.HAS_MORE_STATE; | |
| } | |
| if (regionNode.isInState(State.CLOSING)) { | |
| // This is possible, think the target RS crashes and restarts immediately, the close region | |
| // operation will return a NotServingRegionException soon, we can only recover after SCP takes | |
| // care of this RS. So here we throw an IOException to let upper layer to retry with backoff. | |
| setNextState(RegionStateTransitionState.REGION_STATE_TRANSITION_CLOSE); | |
| throw new HBaseIOException("Failed to close region"); | |
| } | |
| // abnormally closed, need to reopen it, no matter what is the last state, see the comment in | |
| // confirmOpened for more details that why we need to reopen the region first even if we just | |
| // want to close it. | |
| // The only exception is for non-default replica, where we do not need to deal with recovered | |
| // edits. Notice that the region will remain in ABNORMALLY_CLOSED state, the upper layer need to | |
| // deal with this state. For non-default replica, this is usually the same with CLOSED. | |
| assert regionNode.isInState(State.ABNORMALLY_CLOSED); |
Because FAILED_OPEN is not covered in the conditions. If we add FAILED_OPEN here like so:
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
index 6b5aaf6975..7cf0941f20 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
@@ -422,7 +424,7 @@ public class TransitRegionStateProcedure
private Flow confirmClosed(MasterProcedureEnv env, RegionStateNode regionNode)
throws IOException {
- if (regionNode.isInState(State.CLOSED)) {
+ if (regionNode.isInState(State.CLOSED, State.FAILED_OPEN)) {
retryCounter = null;
if (lastState == RegionStateTransitionState.REGION_STATE_TRANSITION_CONFIRM_CLOSED) {
// we are the last state, finishWe can avoid the assertion error, but CloseTableRegionsProcedure will not finish and endlessly retry:
There are still 1 unclosed region(s) for closing regions of table testDisableFailedOpenRegions when executing CloseTableRegionsProcedure, continue...
So we also need to change the implementation of numberOfUnclosedRegions to account for FAILED_OPEN regions.
hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Lines 1147 to 1163 in 42715b6
| private int numberOfUnclosedRegions(TableName tableName, | |
| Function<RegionStateNode, Boolean> shouldSubmit) { | |
| int unclosed = 0; | |
| for (RegionStateNode regionNode : regionStates.getTableRegionStateNodes(tableName)) { | |
| regionNode.lock(); | |
| try { | |
| if (shouldSubmit.apply(regionNode)) { | |
| if (!regionNode.isInState(State.OFFLINE, State.CLOSED, State.SPLIT)) { | |
| unclosed++; | |
| } | |
| } | |
| } finally { | |
| regionNode.unlock(); | |
| } | |
| } | |
| return unclosed; | |
| } |
But I felt this was getting too complicated, and it's simpler to just change the state to CLOSED.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if the state is CLOSED, do we still need to call udpateRegionLocation?
I called the method to persist the in-memory change of regionNode to the meta table. Would it be clearer if I call persistToMeta instead? The result is roughly the same.
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
index 6b5aaf6975..69c9d1ffca 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
@@ -408,7 +408,7 @@ public class TransitRegionStateProcedure
// FAILED_OPEN doesn't need further transition, immediately mark the region as closed
AssignmentManager am = env.getAssignmentManager();
am.getRegionStates().removeFromFailedOpen(regionNode.getRegionInfo());
- future = am.getRegionStateStore().updateRegionLocation(regionNode);
+ future = am.persistToMeta(regionNode);
}
if (future != null) {
ProcedureFutureUtil.suspendIfNecessary(this, this::setFuture, future, env,If the question is about if it's necessary to persist the changed state to meta, it looks like it's not necessary in this case, though we have to change an assertion in the test code.
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
index 6b5aaf6975..6b7989dd58 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
@@ -408,7 +408,7 @@ public class TransitRegionStateProcedure
// FAILED_OPEN doesn't need further transition, immediately mark the region as closed
AssignmentManager am = env.getAssignmentManager();
am.getRegionStates().removeFromFailedOpen(regionNode.getRegionInfo());
- future = am.getRegionStateStore().updateRegionLocation(regionNode);
+ future = CompletableFuture.allOf();
}
if (future != null) {
ProcedureFutureUtil.suspendIfNecessary(this, this::setFuture, future, env,
diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestTransitRegionStateProcedure.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestTransitRegionStateProcedure.java
index 074e7e730c..392993280b 100644
--- a/hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestTransitRegionStateProcedure.java
+++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestTransitRegionStateProcedure.java
@@ -250,8 +250,8 @@ public class TestTransitRegionStateProcedure {
// The number of RITs should be 0 after disabling the table
assertEquals(0, getTotalRITs());
- // The regions are now in "CLOSED" state
- assertEquals(Collections.singleton("CLOSED"), getRegionStates());
+ // The regions are still in "FAILED_OPEN" state
+ assertEquals(Collections.singleton("FAILED_OPEN"), getRegionStates());
// Fix the error in the table descriptor
tdb = TableDescriptorBuilder.newBuilder(td);However, I think we should try to keep the in-memory state and persistent meta state synchronized to avoid confusion and any potential issues that might arise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, in the if condition you call setState, I misread the code, I thought it is isInState.
Then the logic is fine. But better to add more comments to say why here we do not need to treat CLOSED state specially.
And I suggest that we use two ProcedureFutureUtil.suspendIfNecessary calls for these two conditions, so we do not need to add extra check in closeRegionAfterUpdatingMeta, since for the region in FAILED_OPEN state, the only action after updating meta is to change the state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I suggest that we use two ProcedureFutureUtil.suspendIfNecessary calls for these two conditions, so we do not need to add extra check in closeRegionAfterUpdatingMeta
Ahh.. I remembered why I organized the code this way. This was not super obvious.
When you disable the table with "FAILED_OPEN" regions,
DisableTableProcedurecreates aCloseTableRegionsProcedure,- It results in
TRSPwithREGION_STATE_TRANSITION_CLOSEas the initial state
Lines 157 to 160 in cab6eb4
case UNASSIGN: initialState = RegionStateTransitionState.REGION_STATE_TRANSITION_CLOSE; lastState = RegionStateTransitionState.REGION_STATE_TRANSITION_CONFIRM_CLOSED; break; - Because of the initial condition, it calls
TRSP#closeRegion,
Lines 516 to 518 in cab6eb4
case REGION_STATE_TRANSITION_CLOSE: closeRegion(env, regionNode); return Flow.HAS_MORE_STATE; - which in turn calls
closeRegionAfterUpdatingMeta
Lines 394 to 401 in cab6eb4
private void closeRegion(MasterProcedureEnv env, RegionStateNode regionNode) throws IOException, ProcedureSuspendedException { if ( ProcedureFutureUtil.checkFuture(this, this::getFuture, this::setFuture, () -> closeRegionAfterUpdatingMeta(env, regionNode)) ) { return; } - So we're entering
closeRegionAfterUpdatingMetafor aFAILED_OPENregion and HBase creates aCloseRegionProcedurewhich is exactly what we tried to avoid. CloseRegionProcedureis a subclass ofRegionRemoteProcedureBasewhich requirestargetServer, but aFAILED_OPENregion lacks it, and the procedure fails and hangs.2025-05-23T19:17:19,176 WARN [PEWorker-1 {}] procedure2.ProcedureExecutor$WorkerThread(2184): Worker terminating UNNATURALLY null java.lang.NullPointerException: null at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProcedureProtos$RegionRemoteProcedureBaseStateData$Builder.setTargetServer(MasterProcedureProtos.java:54339) ~[classes/:?] at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.serializeStateData(RegionRemoteProcedureBase.java:382) ~[classes/:?] at org.apache.hadoop.hbase.master.assignment.CloseRegionProcedure.serializeStateData(CloseRegionProcedure.java:74) ~[classes/:?]
So that explains why I had to put the check at the start of closeRegionAfterUpdatingMeta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it looks like my analysis was not entirely correct, because at step 4, the initial call to checkFuture { closeRegionAfterUpdatingMeta } will not actually trigger closeRegionAfterUpdatingMeta. But the suspension of the procedure in suspendIfNecessary is what triggers multiple closeRegion calls which leads to a call to closeRegionAfterUpdatingMeta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I suggest that we use two ProcedureFutureUtil.suspendIfNecessary calls for these two conditions,
Addressed that in 718156e. I also added more comments.
so we do not need to add extra check in closeRegionAfterUpdatingMeta
The extra check is still there for the reason mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed that in 718156e. I also added more comments.
Apologies for the noise. I reverted the change after realizing that ProcedureFutureUtil.suspendIfNecessary does not guarantee execution of actionAfterDone when its future is suspended. So we may get unpredictable behavior if actionAfterDone of the previous checkFuture and that of suspendIfNecessary are not identical.
- checkFuture(null) { closeRegionAfterUpdatingMeta }
- suspendIfNecessary(updateRegionLocation) { setNextState }
- suspended
- closeRegion called again
- checkFuture(updateRegionLocation) { closeRegionAfterUpdatingMeta }
- closeRegionAfterUpdatingMeta
- not suspended
- setNextState
- suspended
So with the latest commit, closeRegionAfterUpdatingMeta serves as the only terminal point of the procedure, regardless of the previous state of the region or whether the future was suspended or not.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
No description provided.