-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-22942 move Snapshot verification to procedure #662
Conversation
Hi @busbey , I have moved the SnapshotReferenceUtil.verifySnapshot() from SnapshotManager to prepareRestore() step of RestoreSnapshotProcedure. Please review the PR and let me know if this is what was intended? |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to the procedure if the verify fails? Will the Procedure close out cleanly? Or will we it be perpetually rescheduled? Thanks.
Hi @saintstack ,If the procedure fails as a result of verifySnapshot, it will throw corruptedSnapshotException or IOException which will be taken care as a part of the procedure framework. The procedure will close out cleanly. |
Please check on the unit test failures since they're supposed to cover the code that changed. |
@@ -351,6 +352,12 @@ private void prepareRestore(final MasterProcedureEnv env) throws IOException { | |||
mfs.getFileSystem(), | |||
SnapshotDescriptionUtils.getCompletedSnapshotDir(snapshot, mfs.getRootDir()), | |||
snapshot); | |||
|
|||
// Verify snapshot validity | |||
SnapshotReferenceUtil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently in a block for only checking non-system tables. Needs to happen for all tables.
Also should probably be the first thing we do in prepareRestore, since it used to proceed the step.
Please update the status monitor about doing the verification as well.
@sanjeetnishad95 See @busbey comments above? Any chance of addressing them? Thanks. |
yes @saintstack , I will address them ASAP. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Closing. No progress. Make a new PR if I have this wrong. Thanks. |
Problem : Right now we do snapshot verification prior to queueing the restore / clone request from the client. That means the initial call from the client has to block until we're done. On large manifests (~single digit millions) this easily takes longer than the default timeout of 20 minutes.
Solution: Instead we should handle verification as one of the steps in the relevant procedure (hbase 2+) or table handler (hbase 1) so that the master can do it in the background and report status.