HubSpot Backport: HBASE-28680 BackupLogCleaner causes HMaster WALs to pile up indefinitely (#6006) #100

Merged: 1 commit, Jul 8, 2024

rmdmattingly

We have been trying to set up daily incremental backups for hundreds of clusters at my day job. Recently we discovered that old WALs were piling up across many clusters, in line with when we began running incremental backups.

This led to the realization that the BackupLogCleaner will always skip archived HMaster WALs. This is a problem because, if a cleaner is skipping a given file, then the CleanerChore will never delete it.

This seems like a misunderstanding of what it means to "skip" a WAL in a BaseLogCleanerDelegate, and, instead, we should always return these HMaster WALs as deletable from the perspective of the BackupLogCleaner. We could subject them to the same scrutiny as RegionServer WALs: are they older than the most recent successful backup? But, if I understand correctly, HMaster WALs do not contain any data relevant to table backups, so that would be unnecessary.
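
To illustrate the behavior described above, here is a minimal, self-contained sketch. It is not the actual `BackupLogCleaner` code: `WalFile`, `BackupCleanerSketch`, and the single `lastBackupMs` timestamp are hypothetical stand-ins, and it assumes a simplified contract in which only the files a cleaner returns are eligible for deletion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an archived WAL file; not an HBase class.
final class WalFile {
  final String name;
  final boolean fromMaster; // true for an archived HMaster WAL
  final long timestampMs;   // when the WAL was written

  WalFile(String name, boolean fromMaster, long timestampMs) {
    this.name = name;
    this.fromMaster = fromMaster;
    this.timestampMs = timestampMs;
  }
}

// Hypothetical sketch of the intended cleaner decision; not the real BackupLogCleaner.
final class BackupCleanerSketch {
  // Assumed contract: only files returned here may be deleted.
  // Any file left out ("skipped") is retained.
  static List<WalFile> getDeletableFiles(List<WalFile> candidates, long lastBackupMs) {
    List<WalFile> deletable = new ArrayList<>();
    for (WalFile wal : candidates) {
      if (wal.fromMaster) {
        // HMaster WALs are assumed to hold no table-backup data,
        // so the backup cleaner always reports them as deletable.
        deletable.add(wal);
      } else if (wal.timestampMs < lastBackupMs) {
        // RegionServer WALs are deletable only if older than the
        // most recent successful backup.
        deletable.add(wal);
      }
    }
    return deletable;
  }
}
```

Under the old behavior, the `fromMaster` branch would have skipped the file instead of adding it, so under this contract the file would never become eligible for deletion.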

…ely (apache#6006)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
@rmdmattingly merged commit 84551d6 into hubspot-2.6 on Jul 8, 2024
1 check passed
@rmdmattingly deleted the 28680-hubspot-2.6 branch on July 8, 2024 14:40