-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28687 BackupSystemTable#checkSystemTable should ensure system tables are enabled #6018
Conversation
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Hmm. Test failures could be related. Looking into it edit: they pass on my machine 😓 |
…ables are enabled
7aef574
to
f920d12
Compare
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
This same test is failing across multiple PRs: #6017 |
💔 -1 overall
This message was automatically generated. |
Okay, I can reproduce the failures if I run all backups tests locally rather than just the one. I bet this has something to do with all of the inheritance going on in these unit tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO enabling them makes sense since kicking off a backup is an implicitly request to use the backup state. It's also consistent with the other states handled here.
The only other thing I'd say is, do we need to handle exceptions? For example if we check that the table is disabled and it is, so we go to enable and it's already enabled. I've seen this problem in other areas where we handle table lifecycle automatically. Basically if multiple callers conflict, it throws an exception that could easily have been handled.
If this is already protected by the backup lock, then may not be necessary.
@Before | ||
public void ensurePreviousBackupTestsAreCleanedUp() throws Exception { | ||
// Every operation here may not be necessary for any given test, | ||
// some often being no-ops. the goal is to help ensure atomicity | ||
// of that tests that implement TestBackupBase | ||
try (BackupAdmin backupAdmin = getBackupAdmin()) { | ||
backupManager.finishBackupSession(); | ||
backupAdmin.listBackupSets().forEach(backupSet -> { | ||
try { | ||
backupAdmin.deleteBackupSet(backupSet.getName()); | ||
} catch (IOException ignored) { | ||
} | ||
}); | ||
} catch (Exception ignored) { | ||
} | ||
Arrays.stream(TEST_UTIL.getAdmin().listTableNames()) | ||
.filter(tableName -> !tableName.isSystemTable()).forEach(tableName -> { | ||
try { | ||
TEST_UTIL.truncateTable(tableName); | ||
} catch (IOException ignored) { | ||
} | ||
}); | ||
TEST_UTIL.getMiniHBaseCluster().getRegionServerThreads().forEach(rst -> { | ||
try { | ||
LogRoller walRoller = rst.getRegionServer().getWalRoller(); | ||
walRoller.requestRollAll(); | ||
walRoller.waitUntilWalRollFinished(); | ||
} catch (Exception ignored) { | ||
} | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some playing around, this seemed like a pretty good way to get TestBackupMerge
, and others, to consistently pass even when ran in bulk. Another option would be to get rid of the TestBackupMerge inheritance, but that would be a really annoying change to implement, and would probably be significantly slower in the build pipeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's try it.
For what it's worth, I've had success with refactoring inheritance out of these junit4 tests by moving shared logic into an ExternalResource
implementation. It doesn't let you reuse @Test
implementations, but it does make implementing before/after logic more composable.
Agreed with the feedback, I added handling of TableNotDisabled exceptions |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@Before | ||
public void ensurePreviousBackupTestsAreCleanedUp() throws Exception { | ||
// Every operation here may not be necessary for any given test, | ||
// some often being no-ops. the goal is to help ensure atomicity | ||
// of that tests that implement TestBackupBase | ||
try (BackupAdmin backupAdmin = getBackupAdmin()) { | ||
backupManager.finishBackupSession(); | ||
backupAdmin.listBackupSets().forEach(backupSet -> { | ||
try { | ||
backupAdmin.deleteBackupSet(backupSet.getName()); | ||
} catch (IOException ignored) { | ||
} | ||
}); | ||
} catch (Exception ignored) { | ||
} | ||
Arrays.stream(TEST_UTIL.getAdmin().listTableNames()) | ||
.filter(tableName -> !tableName.isSystemTable()).forEach(tableName -> { | ||
try { | ||
TEST_UTIL.truncateTable(tableName); | ||
} catch (IOException ignored) { | ||
} | ||
}); | ||
TEST_UTIL.getMiniHBaseCluster().getRegionServerThreads().forEach(rst -> { | ||
try { | ||
LogRoller walRoller = rst.getRegionServer().getWalRoller(); | ||
walRoller.requestRollAll(); | ||
walRoller.waitUntilWalRollFinished(); | ||
} catch (Exception ignored) { | ||
} | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's try it.
For what it's worth, I've had success with refactoring inheritance out of these junit4 tests by moving shared logic into an ExternalResource
implementation. It doesn't let you reuse @Test
implementations, but it does make implementing before/after logic more composable.
…ables are enabled (apache#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (apache#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (apache#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (apache#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ables are enabled (apache#6018) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…ld ensure system tables are enabled (apache#6018) (#102) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
…ld ensure system tables are enabled (apache#6018) (#103) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
See https://issues.apache.org/jira/browse/HBASE-28687
If the backup system tables become disabled, then we enter a state which the backup client will not recover from. Without manual intervention, every subsequent backup attempt will fail on BackupSystemTable's calls to waitForSystemTable.
This checkSystemTable method already ensures that the tables exist — it should also ensure that the tables are enabled before we await that condition.
Alternatively, we could fast-fail if the tables are disabled rather than awaiting an enabled state that will never occur.
cc @ndimiduk @charlesconnell @hgromer