HBASE-28697 Don't clean bulk load system entries until backup is complete #6089
Conversation
🎊 +1 overall
This message was automatically generated.
This seems alright to me, but I'd appreciate hearing from another voice familiar with this code path. Specifically, are there semantic implications in other parts of the backup system for having completeBackup called before the deletes occur?
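For reference, a minimal sketch of the reordering under discussion; the names (handleBulkLoad, completeBackup, deleteBulkLoadedRows) follow the conversation, but the signatures and surrounding details are assumptions, not the exact HBase code:

```java
// Sketch of the post-fix ordering (assumed shape, not the exact HBase code).
// handleBulkLoad now only reads and copies the bulk-load entries and returns
// their system-table rowkeys; deletion happens after completion.
List<byte[]> bulkLoadedRows = handleBulkLoad(backupInfo.getTableNames());

// ... remaining incremental backup work (WAL conversion, copy, manifest) ...

// Mark the backup complete first.
completeBackup(conn, backupInfo, backupManager, BackupType.INCREMENTAL, conf);

// Only now remove the bulk-load entries. A failure before this line means the
// next backup re-copies some files (wasteful but harmless) instead of
// silently missing them.
backupManager.deleteBulkLoadedRows(bulkLoadedRows);
```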
```diff
   */
-  @SuppressWarnings("unchecked")
-  protected Map<byte[], List<Path>>[] handleBulkLoad(List<TableName> sTableList)
-    throws IOException {
+  protected List<byte[]> handleBulkLoad(List<TableName> sTableList) throws IOException {
```
Why is it okay to drop the table context of the rowkeys in the returned value? A rowkey is only meaningful in the context of its table (or region).
We talked about this a bit offline. We can purge these rowkeys because they are only returned by handleBulkLoad if we have bulk loaded the keys in this backup.
Right now an inopportune failure would result in us missing bulk load data on subsequent incremental backups; with this change, an inopportune failure would result in us backing up duplicate files, which is a little wasteful but otherwise innocuous.
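To make that contract concrete, here is a hedged sketch of what handleBulkLoad is expected to guarantee after this change; BulkLoadEntry, readBulkLoadEntries, and copyToBackup are hypothetical helpers standing in for the real system-table plumbing:

```java
// Hedged sketch: the returned rowkeys cover exactly the entries whose files
// were copied into *this* backup, so the caller may safely delete them once
// the backup completes. BulkLoadEntry, readBulkLoadEntries, and copyToBackup
// are hypothetical stand-ins, not the real HBase API.
protected List<byte[]> handleBulkLoad(List<TableName> sTableList) throws IOException {
  List<byte[]> processedRowKeys = new ArrayList<>();
  for (BulkLoadEntry entry : readBulkLoadEntries(sTableList)) {
    copyToBackup(entry.getHFilePath());      // include the bulk-loaded file in this backup
    processedRowKeys.add(entry.getRowKey()); // remember its system-table rowkey
  }
  // Table context can be dropped here because these keys exist only to be
  // deleted from the backup system table after completeBackup succeeds.
  return processedRowKeys;
}
```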
Is it worth having some backup consistency check that can detect and purge extra files? Or do we think that backups will cycle out and the redundancy will be dropped the next time a full backup is taken?
@DieterDP-ng any thoughts on this PR?
…lete (apache#6089) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Sorry for the late reply. These changes look OK to me.
I'm not aware of any.
https://issues.apache.org/jira/browse/HBASE-28697
I've been thinking through the incremental backup order of operations, and I think we delete rows from the bulk loads system table too early and, consequently, make it possible to produce a "successful" incremental backup that is missing bulk loads.
To summarize the steps here, starting in IncrementalTableBackupClient#execute: the bulk-load entries recorded in the backup system table are copied into the backup and their rows are deleted, and only later is the backup marked complete via completeBackup. Any failure in between leaves no record of those bulk loads for the next incremental backup.
We could consider this issue an extension or replacement of https://issues.apache.org/jira/browse/HBASE-28084 in some ways, depending on what solution we land on. I think we could fix this specific issue by reordering the bulk load table cleanup, but there will always be gotchas like this. Maybe it is simpler to require that the next backup be a full backup after any incremental failure.
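For contrast, a rough sketch of the pre-fix ordering that makes the failure mode visible (same caveat: names are illustrative, not the exact code):

```java
// Rough sketch of the pre-fix ordering (illustrative, not the exact code).
handleBulkLoad(backupInfo.getTableNames()); // copies HFiles AND deletes the
                                            // bulk-load rows right away
// <-- any failure in the remaining steps leaves the rows already gone, so
//     the next incremental backup has no record of those bulk loads
completeBackup(conn, backupInfo, backupManager, BackupType.INCREMENTAL, conf);
```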
cc @hgromer @ndimiduk @DieterDP-ng