HBASE-29519 Copy Bulkloaded Files in Continuous Backup #7222
Conversation
Pull Request Overview
This PR implements bulk-loaded file copying functionality for continuous backup in HBase. The feature ensures that bulk-loaded files are replicated and backed up along with WAL entries during continuous backup operations.
- Adds bulk load processing capability to extract file paths from WAL entries containing bulk load descriptors
- Implements file upload mechanism to copy bulk-loaded files to backup storage
- Enhances test coverage for bulk load scenarios and validates backup integrity
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Summary per file:
| File | Description |
|---|---|
| BulkLoadProcessor.java | New utility class to process WAL entries and extract bulk load file paths |
| ContinuousBackupReplicationEndpoint.java | Enhanced to handle bulk load file backup during WAL replication |
| BackupFileSystemManager.java | Added bulk load files directory management |
| TestBulkLoadProcessor.java | Unit tests for bulk load processing functionality |
| TestContinuousBackupReplicationEndpoint.java | Updated tests to include bulk load scenarios and verification |
| FullTableBackupClient.java | Added informational message when bulk load replication is disabled |
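To make the first overview item concrete, here is a rough, hedged sketch of how bulk-load file paths can be pulled out of WAL entries. It is not the PR's actual BulkLoadProcessor; the class name, the namespace/table/region/family/file path layout, and the helper structure are illustrative assumptions, while WALEdit.getBulkLoadDescriptor and the WALProtos descriptors are existing HBase APIs.

```java
// Illustrative sketch only -- not the PR's actual BulkLoadProcessor implementation.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil;
import org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos.BulkLoadDescriptor;
import org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos.StoreDescriptor;
import org.apache.hadoop.hbase.wal.WAL;
import org.apache.hadoop.hbase.wal.WALEdit;

public final class BulkLoadPathSketch {

  /** Collects the relative store-file paths referenced by bulk-load markers in WAL entries. */
  static List<Path> extractBulkLoadPaths(List<WAL.Entry> entries) throws IOException {
    List<Path> paths = new ArrayList<>();
    for (WAL.Entry entry : entries) {
      for (Cell cell : entry.getEdit().getCells()) {
        // Returns null for cells that are not bulk-load marker cells.
        BulkLoadDescriptor descriptor = WALEdit.getBulkLoadDescriptor(cell);
        if (descriptor == null) {
          continue;
        }
        TableName table = ProtobufUtil.toTableName(descriptor.getTableName());
        String region = descriptor.getEncodedRegionName().toStringUtf8();
        for (StoreDescriptor store : descriptor.getStoresList()) {
          String family = store.getFamilyName().toStringUtf8();
          for (String storeFile : store.getStoreFileList()) {
            // Assumed layout for illustration: namespace/table/region/family/file
            paths.add(new Path(table.getNamespaceAsString(), new Path(table.getQualifierAsString(),
              new Path(region, new Path(family, storeFile)))));
          }
        }
      }
    }
    return paths;
  }

  private BulkLoadPathSketch() {
  }
}
```

The real processor presumably hands such paths to the replication endpoint, which then copies the referenced HFiles to the backup file system alongside the WAL entries.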
    assertEquals(2, paths.size());
    System.out.println(paths);
Copilot AI (Aug 13, 2025):
Debug print statements should be removed from production test code. Use proper logging if debug output is needed for troubleshooting.
Suggested change:
    assertEquals(2, paths.size());
    assertEquals(2, paths.size());
    System.out.println(paths);
Copilot AI (Aug 13, 2025):
Debug print statements should be removed from production test code. Use proper logging if debug output is needed for troubleshooting.
Suggested change:
    assertEquals(2, paths.size());
    List<Path> paths = BulkLoadProcessor.processBulkLoadFiles(Arrays.asList(entry, entry2));
    System.out.println(paths);
Copilot AI (Aug 13, 2025):
Debug print statements should be removed from production test code. Use proper logging if debug output is needed for troubleshooting.
    backupInfo.setState(BackupState.COMPLETE);
    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
      System.out.println("NOTE: Bulkload replication is not enabled. "
Copilot AI (Aug 13, 2025):
Using System.out.println for user notifications in production code is not recommended. Consider using proper logging (LOG.info) or a more appropriate output mechanism for user-facing messages.
Suggested change:
-      System.out.println("NOTE: Bulkload replication is not enabled. "
+      LOG.warn("NOTE: Bulkload replication is not enabled. "
    }
    walWriter.sync(true);
    uploadBulkLoadFiles(day, bulkLoadFiles);
Copilot AI (Aug 13, 2025):
The bulk load file upload is performed synchronously for each WAL entry batch. This could become a performance bottleneck if there are many bulk load files. Consider batching or asynchronous processing for better performance.
Yes, we considered and discussed this. We decided to keep it synchronous for now, as making it asynchronous would add complexity and might be a premature optimization. We can revisit this approach if it turns out to be a real performance bottleneck.
Any other thoughts?
I put my questions in the JIRA; let me link them here as well. It's mainly that if we do any async upload, or upload a list of files at the end of WAL closing, we will need to be very careful and may need a different implementation, so the current approach may be the best for the moment.
Yeah, going with async has all the issues you mentioned. We’ll keep it as is for now.
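For readers following along, below is a minimal, self-contained sketch of the synchronous per-batch upload being discussed. The helper name, the relative-path layout, and the use of FileUtil.copy are assumptions; in the endpoint's real code the upload runs right after walWriter.sync(true) and differs in detail.

```java
// Hypothetical sketch of the synchronous upload loop; names and layout are assumptions.
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

final class SyncBulkLoadUploadSketch {

  /** Copies each bulk-loaded file to the backup root one at a time, after the WAL entries are synced. */
  static void uploadBulkLoadFiles(Configuration conf, Path sourceRootDir, Path backupBulkLoadDir,
    List<Path> relativePaths) throws IOException {
    FileSystem srcFs = sourceRootDir.getFileSystem(conf);
    FileSystem dstFs = backupBulkLoadDir.getFileSystem(conf);
    for (Path relative : relativePaths) {
      Path src = new Path(sourceRootDir, relative);
      Path dst = new Path(backupBulkLoadDir, relative);
      // FileUtil.copy returns false when the copy did not happen; treat that as a failure.
      if (!FileUtil.copy(srcFs, src, dstFs, dst, false /* deleteSource */, true /* overwrite */,
        conf)) {
        throw new IOException("Failed to back up bulk load file " + src + " to " + dst);
      }
    }
  }

  private SyncBulkLoadUploadSketch() {
  }
}
```

Keeping the loop synchronous means a failed copy surfaces immediately in the replication retry path, which is the simplicity the thread above opted for.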
taklwu left a comment:
Should we open this PR against apache:HBASE-28957_rebase?
No need. This branch is rebased by @anmolnar already.
taklwu left a comment:
We can update the comment for when hbase.replication.bulkload.enabled is disabled; for now, let's suppose we have instructions for the user to manually move the related bulkload files to the backup location. If you think this should not be supported, we should fail the replication.
| + "(hbase.replication.bulkload.enabled=true) and configure other necessary settings " | ||
| + "to properly enable bulkload replication."); |
Message is confusing. "You" are HBase, so please tell the user exactly what needs to be done in order to enable bulkload backup.
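As one possible answer to this comment, the message could spell out the exact settings the user needs to change. The wording below is only a suggestion, and the configuration advice (hbase.replication.cluster.id, restarting region servers) should be checked against your deployment; only the HConstants keys themselves are existing HBase constants.

```java
// Hedged example wording only; the merged patch's final message may differ.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HConstants;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class BulkLoadWarningSketch {
  private static final Logger LOG = LoggerFactory.getLogger(BulkLoadWarningSketch.class);

  static void warnIfBulkLoadReplicationDisabled(Configuration conf) {
    if (!conf.getBoolean(HConstants.REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
      LOG.warn("Bulkload replication is disabled, so bulk loaded files will not be backed up as "
        + "part of continuous backup. To include them, set {}=true and give {} a unique value on "
        + "the source cluster, then restart the region servers.",
        HConstants.REPLICATION_BULKLOAD_ENABLE_KEY, HConstants.REPLICATION_CLUSTER_ID);
    }
  }

  private BulkLoadWarningSketch() {
  }
}
```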
    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
      System.out.println("NOTE: Bulkload replication is not enabled. "
        + "Bulk loaded files will not be backed up as part of continuous backup. "
Since continuous backup is using HBase replication, bulk loaded files won't be backup up as part of continuous backup.
    backupInfo.setState(BackupState.COMPLETE);
    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
      System.out.println("NOTE: Bulkload replication is not enabled. "
WARNING: ...
    String errorMsg = String.format("%s Failed to back up bulk load file %s to %s",
      Utils.logPeerId(peerId), file, destPath);
    LOG.error("{}: {}", errorMsg, e.getMessage(), e);
    throw new IOException(errorMsg, e);
Before returning from a failed upload operation you probably want to make sure there was no garbage left at the destination. For instance, if continuing the upload is not supported, the partially uploaded file should be deleted.
Yeah, that makes sense. Since rename isn’t supported in Hadoop-S3, can we do something like:
- try the copy,
- if it fails, attempt to delete the partially uploaded file,
- and in case the delete also fails, the next retry will always overwrite the file anyway, so we don’t get stuck with stale data.
Sounds good to me. You could attempt another delete here in the catch block. Depending on the failure, you might be able to delete the partially uploaded file right here.
Yeah, we are doing that; I missed mentioning it here.
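Here is a sketch of the copy-then-clean-up behaviour agreed on in this thread: try the copy, and on failure make a best-effort attempt to delete the partially uploaded destination file. Only the copyWithCleanup signature is taken from the code under review; the body is an assumption, not the PR's exact implementation.

```java
// Sketch under the assumptions above; not the PR's exact implementation.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class CopyWithCleanupSketch {
  private static final Logger LOG = LoggerFactory.getLogger(CopyWithCleanupSketch.class);

  static void copyWithCleanup(FileSystem srcFS, Path src, FileSystem dstFS, Path dst,
    Configuration conf) throws IOException {
    try {
      if (!FileUtil.copy(srcFS, src, dstFS, dst, false /* deleteSource */, true /* overwrite */,
        conf)) {
        throw new IOException("Copy of " + src + " to " + dst + " reported failure");
      }
    } catch (IOException e) {
      // Best-effort cleanup so no partial file is left at the destination. If the delete also
      // fails, the next retry overwrites the destination anyway, so stale data does not linger.
      try {
        if (dstFS.exists(dst)) {
          dstFS.delete(dst, false);
        }
      } catch (IOException cleanupError) {
        LOG.warn("Failed to delete partially uploaded file {}", dst, cleanupError);
      }
      throw e;
    }
  }

  private CopyWithCleanupSketch() {
  }
}
```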
    if (result == null) {
      LOG.error("{} No bulk loaded file found in relative path: {}", Utils.logPeerId(peerId),
        relativePathFromNamespace);
      throw new IOException(
You might want to throw a different type of exception than the one the FileUtil.copy(...) operation throws.
    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
      System.out.println("WARNING: Bulkload replication is not enabled. "
        + "Since continuous backup is using HBase replication, bulk loaded files won't be backup up as part of continuous backup. "
typo: "backed up"
    static void copyWithCleanup(FileSystem srcFS, Path src, FileSystem dstFS, Path dst,
        Configuration conf) throws IOException {
      try {
        if (dstFS.exists(dst)) {
If the file exists and its size (or hash) equals that of the file you're going to upload, you can skip the entire upload process.
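A minimal sketch of that skip-if-already-uploaded check, comparing file lengths only (a checksum comparison would be stricter); the helper name and where it is called from are assumptions.

```java
// Assumed helper; compare sizes before uploading and skip the copy when they match.
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class SkipExistingUploadSketch {

  /** Returns true when the destination already holds a file of the same length as the source. */
  static boolean alreadyUploaded(FileSystem srcFS, Path src, FileSystem dstFS, Path dst)
    throws IOException {
    if (!dstFS.exists(dst)) {
      return false;
    }
    return srcFS.getFileStatus(src).getLen() == dstFS.getFileStatus(dst).getLen();
  }

  private SkipExistingUploadSketch() {
  }
}
```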
🎊 +1 overall
This message was automatically generated.

🎊 +1 overall
This message was automatically generated.
anmolnar left a comment:
lgtm.
Thanks @anmolnar for the review. Can we merge if it looks good?
@vinayakphegde I merged it.
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Signed-off-by: Andor Molnár <andor@apache.org>