HBASE-28562 Correct backup ancestor calculation #5868
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java
```java
// The ancestors consist of the most recent FULL backups that cover the list of tables
// required in the new backup and all INCREMENTAL backups that came after one of those FULL
// backups.
if (backup.getType().equals(BackupType.INCREMENTAL)) {
```
I believe we're just iterating through all backup history at this point — what if the given incremental backup is not relevant to the `tablesToCover`? I think that should not be included as an ancestor, and I think that's what `BackupManifest#canCoverImage` was hoping to achieve (but failing to, due to the bugs explained in https://issues.apache.org/jira/browse/HBASE-28562?focusedCommentId=17843008&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17843008).

Hyperbolically, on a cluster with hundreds of tables taking incremental backups every day, this would quickly balloon into huge BackupManifests that cannot fit in a reasonably sized heap.
I'm not sure I understand your point, or I'm still missing something in my understanding of the backup mechanism.
In my understanding, a newly created incremental backup will include all tables that have been included in a full backup at some point. (I base this on what is shown in `hbase backup history`.) So that means that all incremental backups up to (and including) the original full backups are in fact ancestors.
An incremental backup can apply to any subset of the total tables on a cluster, not necessarily all tables. So if your cluster has tables A, B, and C and you're fetching the ancestry for a table C backup, then you really don't want to include ancestors that only apply to table A and/or B.
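The relevance idea above can be sketched as a small overlap check (a hypothetical helper using plain table-name strings, not the actual `BackupManifest#canCoverImage` implementation): a candidate ancestor only matters if its table set intersects the tables the new backup must cover.

```java
import java.util.HashSet;
import java.util.Set;

public class CoverageSketch {
    // Hypothetical helper (not actual HBase code): decide whether a candidate
    // ancestor backup is relevant, i.e. whether it covers at least one of the
    // tables that the new backup still needs covered.
    public static boolean isRelevant(Set<String> tablesToCover, Set<String> ancestorTables) {
        Set<String> overlap = new HashSet<>(tablesToCover);
        overlap.retainAll(ancestorTables);
        return !overlap.isEmpty();
    }
}
```

For the A/B/C example above, an ancestor covering only {A, B} would be skipped when fetching ancestry for a table C backup.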
I did the following experiment:

```shell
echo "create 'table1', 'cf'" | bin/hbase shell -n
echo "create 'table2', 'cf'" | bin/hbase shell -n
echo "create 'table3', 'cf'" | bin/hbase shell -n
bin/hbase backup create full file:/tmp/hbasebackups -t table1,table2
echo "put 'table1', 'row1', 'cf:a', 'valueA'" | bin/hbase shell -n
echo "put 'table2', 'row1', 'cf:a', 'valueB'" | bin/hbase shell -n
bin/hbase backup create incremental file:/tmp/hbasebackups -t table1
bin/hbase backup history
```

The history output:

```
{ID=backup_1714763143839,Type=INCREMENTAL,Tables={table2,table1},State=COMPLETE,Start time=Fri May 03 21:05:44 CEST 2024,End time=Fri May 03 21:05:47 CEST 2024,Progress=100%}
{ID=backup_1714763125413,Type=FULL,Tables={table2,table1},State=COMPLETE,Start time=Fri May 03 21:05:25 CEST 2024,End time=Fri May 03 21:05:29 CEST 2024,Progress=100%}
```

This seems to suggest that the incremental backup includes both tables, even though I only requested table1. So I don't think it's possible to create an incremental backup of just a single table if your FULL backups cover more tables?
Interesting... sounds like yet another bug 😄 I'm going to need to read more of the code to understand what is happening. But I believe we should be able to take incremental backups of individual tables — the API seems to allow for it, and it's a clearly useful feature when the value proposition of the entire incremental backup system is to increase the granularity of backup operations
Hi @rmdmattingly, did you verify differences when restoring those backups? If so, can you share a reproducible set of steps?
I'll take a look today to see if I can give reproducible steps. Separately, +1 to the ancestry calculation improvements here — regardless of whether we support table-specific ancestry trees, I've been playing around with this changeset vs 2.6 today and observed several of the bug fixes that you've implemented here (stopping at the appropriate full backup, no more recursion issues when calculating ancestors' ancestors).
I wrote up this test, which only proves your point that incremental backups ignore the tableset input — ideally, from my perspective, this test would fail at

```java
// contains all 200 rows??? why/how/are they correct
assertEquals(rowCount, 200);
```

or the API wouldn't accept a set of tables to back up.
As for why I'm seeing different backup contents at my day job, it's a good question and probably not as easy to reproduce in a unit test. I'm picking up a years-long backup system refactor after a colleague's departure, so I probably won't have a quick answer there either, but will continue to dig.
All this to say, you're correct here and I want to reiterate the improvements I mentioned above, so I think we should ship these changes 🚢
@rmdmattingly do we need to add your new unit test someplace?
Chatted with Nick this morning. I think there's room for a unit test of this genre — once we figure out exactly what the intention of the TableSet is on the backups API, then we should add a test which enforces these intentions. But this test in its current form isn't useful for that
```java
if (tablesToCover.isEmpty()) {
  LOG.debug("Got {} ancestors for the current backup.", ancestors.size());
  return ancestors;
}
```
💯 Great addition
```java
// The ancestors consist of the most recent FULL backups that cover the list of tables
// required in the new backup and all INCREMENTAL backups that came after one of those FULL
// backups.
```
A separate bug that I noticed with this changeset:
Let's say that we have the following chronological backup history: F_1, I_2, F_3, I_4, where the letter prefix indicates the type of backup (full or incremental) and the number indicates the timestamp at which it was taken.

With our current setup, fetching the ancestry of I_2 would return F_3, a backup which does not precede it. Instead, this method should return F_1.
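The "future ancestor" problem can be illustrated with a minimal sketch (hypothetical types and field names, not the real BackupManager logic): when walking history newest-to-oldest, any backup whose start timestamp is not strictly before the target's must be skipped.

```java
import java.util.ArrayList;
import java.util.List;

public class AncestrySketch {
    public enum Type { FULL, INCREMENTAL }

    // Hypothetical stand-in for BackupInfo: an id, a type, and a start timestamp.
    public record Backup(String id, Type type, long startTs) {}

    // Walk history from newest to oldest, skipping backups that do not precede
    // the target, and stop once the covering FULL backup is found.
    public static List<Backup> ancestors(Backup target, List<Backup> newestFirst) {
        List<Backup> result = new ArrayList<>();
        for (Backup b : newestFirst) {
            if (b.startTs() >= target.startTs()) {
                continue; // "future ancestor": cannot precede the target
            }
            result.add(b);
            if (b.type() == Type.FULL) {
                break; // most recent FULL backup preceding the target
            }
        }
        return result;
    }
}
```

With this guard, asking for the ancestry of I_2 in the history F_1, I_2, F_3, I_4 yields only F_1, since F_3 and I_4 are skipped for not preceding I_2.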
This test illustrates the "future ancestor" problem (it fails with this updated BackupManager): https://github.com/apache/hbase/compare/master...HubSpot:hbase:backup-ancestors?expand=1#diff-21c088f224d8731ed61a0c97a72f528097ca02977cbc3195563fc996b1b66c38R205-R232
Indeed, the current implementation assumes that you can only ask the ancestors for a backup that you are newly creating. I think that was also the intention in the original implementation.
I based this on the mention of "for the current backup" in the javadoc and logging of the method, the current usages of the method (all related to a newly created backup), and this comment in the original code: "Otherwise, this incremental backup ancestor is the dependent ancestor of the ongoing incremental backup".
I agree this functionality would be useful for historic backups as well, but it's unexpected that this would only work for backups still present in the backup table (e.g. it would not work for a new HBase instance that wants to restore files from cloud storage). For retrieving the ancestors of historic backups, it makes more sense to read the manifest instead, no?
I suggest clarifying the javadoc to highlight this is only for currently-being-created backups.
Also relevant: in #5871 I clean up some usages of getAncestors; perhaps it's best to make this method non-public (and keep the current implementation, where it only works for currently-being-created backups).
Agreed with all of the reasoning here 👍
Test failure for TestIncrementalBackupMergeWithFailures seems related
I'll update the PR once #5871 gets merged. Not sure how to handle the rebasing of these changes? Should I do a force-push of this branch, or simply create a new PR instead?

A rebase and force push should be good.
The ancestor calculation was wrong for incremental backups: when requesting the ancestors for an incremental backup X, the ancestors could include both full and incremental backups that predate the full backup on which X is built. This caused a crash in incremental backup creation when data of old incremental backups was deleted through other means than the HBase API (i.e. without the HBase backup system table being updated).
Move this method since it is only used by the TableBackupClient.
Force-pushed from 54174b3 to 3e8f2e9.
@rmdmattingly rebased the PR after addressing the
🎊 +1 overall

This message was automatically generated.
Looks good to me. Please fixup the new javac warning and we're good to go.
```java
return Collections.emptyList();
}

List<BackupImage> ancestors = new ArrayList<>();
```
Should the ancestors be a `Set`, so that we don't end up with multiple entries for the same ancestor?
I don't see a need for it. Since we're iterating over a list of unique possible ancestors, we can't end up with duplicates.
```java
Set<TableName> tablesToCover = new HashSet<>(backupInfo.getTables());

// Go over the backup history list in descending order (newest to oldest)
List<BackupInfo> allHistoryList = backupManager.getBackupHistory(true);
```
Should this `getBackupHistory()` method support filters? Could we reduce the search space for applicable backups before entering this loop?
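A predicate-based pre-filter, as suggested above, might look like the following (a hypothetical sketch using a generic element type; the real `getBackupHistory` signature may differ):

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class HistoryFilterSketch {
    // Hypothetical: narrow the full history down to candidate ancestors before
    // the main loop, e.g. keep only COMPLETE backups older than the target.
    public static <T> List<T> filteredHistory(List<T> history, Predicate<T> keep) {
        return history.stream().filter(keep).collect(Collectors.toList());
    }
}
```

Applying such a filter before the descending walk would shrink the search space without changing the loop's logic.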
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/TableBackupClient.java
Requested changes applied.
Ping @bbeaudreault. I think this should be merged so we can start releasing 2.6.1? Thanks.
Thanks for the ping @Apache9. It looks like this is ready, but I've asked @rmdmattingly and @ndimiduk to take one more look and handle committing if it's working for us internally. In terms of 2.6.1, I want to make sure we've fixed as many of the known issues as we can. I've asked @rmdmattingly to compile a list of jiras that we're working on internally. Once I have those I'll set up fixVersions/blockers in jira so that we can track progress towards release.
In fact I'm going to send an email to the dev list so that @DieterDP-ng and his team can also list any jiras we want to try to get into that release. |
The ancestor calculation was wrong for incremental backups: when requesting the ancestors for an incremental backup X, the ancestors could include both full and incremental backups that predate the full backup on which X is built. This caused a crash in incremental backup creation when data of old incremental backups was deleted through other means than the HBase API (i.e. without the HBase backup system table being updated).

Reviewed-by: Ray Mattingly <rmdmattingly@gmail.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>