-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29310 Handle Bulk Load Operations in Continuous Backup #7150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances bulk load tooling and point-in-time restore (PITR) to handle bulk-load operations in continuous backup workflows.
- Adds a post-bulkload suggestion to take a backup
- Extends PITR validation to detect bulk-loaded files since the last backup and adds a
--forceoption - Updates model and storage of bulk-load timestamps and adapts tests and CLI to support the new flow
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| BulkLoadHFilesTool.java | Prints a suggestion to back up after successful bulk load |
| TestPointInTimeRestore.java | Exposed overloaded buildPITRArgs with force flag and added force restore tests |
| TestIncrementalBackupWithContinuous.java | Made test counters dynamic, updated performBulkLoad and loadTable signatures |
| BulkLoad.java | Introduced timestamp field and getter |
| BackupSystemTable.java | Captured cell timestamp and passed it into the BulkLoad constructor |
| BackupAdminImpl.java | Propagated isForce through PITR validation, added checkBulkLoadAfterBackup |
| PointInTimeRestoreRequest.java | Added force parameter to the request model |
| PointInTimeRestoreDriver.java | Parsed -f/--force CLI flag and wired it into the request builder |
| BackupRestoreConstants.java | Defined constants for the force-restore option |
Comments suppressed due to low confidence (1)
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupAdminImpl.java:633
- The Javadoc for
validatePitrdoes not mention the newisForceparameter. Please update the method’s Javadoc to describe this argument.
private void validatePitr(long endTime, TableName[] sTableArray, TableName[] tTableArray,
| System.out.println("Bulk load completed successfully."); | ||
| System.out.println("IMPORTANT: Please take a backup of the table immediately if this table " |
Copilot
AI
Jul 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the framework’s logger (e.g., LOG.info) instead of System.out.println for consistency with the rest of HBase’s logging.
| System.out.println("Bulk load completed successfully."); | |
| System.out.println("IMPORTANT: Please take a backup of the table immediately if this table " | |
| LOG.info("Bulk load completed successfully."); | |
| LOG.info("IMPORTANT: Please take a backup of the table immediately if this table " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not related, you can ignore it.
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BulkLoad.java
Show resolved
Hide resolved
|
|
||
| boolean force = cmd.hasOption(OPTION_FORCE_RESTORE); | ||
| if (force) { | ||
| LOG.debug("Found force option (-{}) in restore command, " |
Copilot
AI
Jul 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] This debug-level message may not be visible in default logs; consider logging at INFO or printing a user-facing warning so users know they’ve forced a restore.
| LOG.debug("Found force option (-{}) in restore command, " | |
| LOG.info("Found force option (-{}) in restore command, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe WARN level?
taklwu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please explain to us why we would like the force option to be implemented.
| String OPTION_FORCE_RESTORE = "f"; | ||
| String LONG_OPTION_FORCE_RESTORE = "force"; | ||
| String OPTION_FORCE_RESTORE_DESC = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you explain why should we support this use case if missing backup could cause data loss ?
|
|
||
| boolean force = cmd.hasOption(OPTION_FORCE_RESTORE); | ||
| if (force) { | ||
| LOG.debug("Found force option (-{}) in restore command, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe WARN level?
| System.out.println("Bulk load completed successfully."); | ||
| System.out.println("IMPORTANT: Please take a backup of the table immediately if this table " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not related, you can ignore it.
|
also, please try to fix the checkstyle and javac if possible. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
kgeisz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, as long as we're okay with the user being able to perform a backup without performing a bulk load first.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
kgeisz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. I just have one comment.
| throw new IOException("Bulk load operation detected after last successful backup for " | ||
| + "table: " + sTableName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this error message should be a little more descriptive? It is saying why there is an error, but it doesn't necessarily tell the user what they should be doing instead to prevent this error.
My understanding is the user is supposed to perform a full or incremental backup after doing a bulkload. To me, this function is detecting that a bulkload has occurred since the last backup, and it is correctly throwing an error. However, the message isn't telling the user they should do another backup after bulkloading in order to get around this error.
|
let's wait till the tests complete and then we can merge |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
taklwu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment has been minimized.
This comment has been minimized.
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…7150) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…7150) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…pache#7150)" This reverts commit 5ac2a73.
…pache#7150)" This reverts commit 5ac2a73.
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…pache#7150)" This reverts commit 5ac2a73.
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
This PR addresses these 2 improvements:
Logging Suggestion during successful bulkload operation:
After a bulk load operation completes, log a message suggesting the user perform a full or incremental backup.
PITR Enhancements:
During PITR, check if any bulk load operation occurred after the last successful backup.
If no such backup exists, inform the user and fail the process.
JIRA: https://issues.apache.org/jira/browse/HBASE-29310