@vinayakphegde
Contributor

@vinayakphegde vinayakphegde commented Feb 24, 2025

This PR introduces the pitr command to restore tables from the most recent backup before --to-datetime and replay WAL logs for changes made after the last backup.

Command Usage:

hbase pitr 
    [-t <table_name[,table_name]>] 
    [-s <backup_set_name>] 
    [-q <name>] 
    [-c] 
    [-m <target_tables>] 
    [-o] 
    [--to-datetime <end_time>] 
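The selection step described above (pick the most recent backup taken before --to-datetime, then replay WALs for the remaining window) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `BackupRecord`, `PitrSelectionSketch`, and `latestBackupBefore` are hypothetical names, and treating the boundary as inclusive is an assumption.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical, simplified backup metadata; not the real HBase backup classes. */
final class BackupRecord {
  final String backupId;
  final long completeTs; // completion time in epoch milliseconds

  BackupRecord(String backupId, long completeTs) {
    this.backupId = backupId;
    this.completeTs = completeTs;
  }
}

final class PitrSelectionSketch {
  private PitrSelectionSketch() {
  }

  /**
   * Picks the newest backup that completed at or before the requested restore time.
   * WAL entries in the window (completeTs, toDateTime] would then be replayed on top of it.
   */
  static Optional<BackupRecord> latestBackupBefore(List<BackupRecord> backups, long toDateTime) {
    return backups.stream()
      .filter(b -> b.completeTs <= toDateTime) // assumption: inclusive boundary
      .max(Comparator.comparingLong(b -> b.completeTs));
  }
}
```

An empty result means there is no backup old enough to serve as the restore base, in which case a point-in-time restore to that timestamp is presumably not possible.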

@Apache-HBase

This comment has been minimized.


} else {
// This server is no longer active (e.g., RS moved or removed); skip
if (LOG.isDebugEnabled()) {
LOG.debug("Skipping replication marker timestamp for invalid server: {}", server);
Contributor

Is the LOG.isDebugEnabled() check necessary here? Shouldn't log4j just not log this message if the log level is set to INFO or higher?

Contributor Author

Correct, it's unnecessary here. We typically use the conditional check when building a complex string or performing expensive operations, which isn't the case in this line.
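To illustrate the distinction, here is a small sketch using a hypothetical `MiniLogger` that mimics parameterized logging. It is not the SLF4J or log4j API, only a stand-in so the example is self-contained. The level check happens inside the logger, so a guard adds nothing when the argument is cheap; the argument expression itself is still evaluated before the call, which is why a guard helps when computing the argument is expensive.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical minimal logger that mimics parameterized logging; not a real logging API. */
final class MiniLogger {
  private final boolean debugEnabled;
  final AtomicInteger formattedMessages = new AtomicInteger();

  MiniLogger(boolean debugEnabled) {
    this.debugEnabled = debugEnabled;
  }

  boolean isDebugEnabled() {
    return debugEnabled;
  }

  void debug(String pattern, Object arg) {
    if (!debugEnabled) {
      return; // level check happens here, before any string formatting
    }
    formattedMessages.incrementAndGet();
    System.out.println(pattern.replace("{}", String.valueOf(arg)));
  }
}

final class LogGuardSketch {
  static final AtomicInteger expensiveCalls = new AtomicInteger();

  private LogGuardSketch() {
  }

  /** Stand-in for an argument that is costly to compute. */
  static String expensiveDump() {
    expensiveCalls.incrementAndGet();
    return "large-state-dump";
  }

  /** Cheap argument: no guard needed, the logger skips formatting on its own. */
  static void logCheapArgument(MiniLogger log, String server) {
    log.debug("Skipping marker for server: {}", server);
  }

  /** Expensive argument: the guard avoids calling expensiveDump() when debug is off. */
  static void logExpensiveArgument(MiniLogger log) {
    if (log.isDebugEnabled()) {
      log.debug("State: {}", expensiveDump());
    }
  }
}
```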

Contributor

@anmolnar anmolnar left a comment

Overall lgtm, but surprising how low the level of unit tests' granularity is: for instance BackupAdminImpl is such a big class and there's no corresponding test class. It has a lot of private methods which could be tested as "units" if they were package private and we had a BackupAdminTest class. Please consider that.

return request;
}

public static PointInTimeRestoreRequest createPointInTimeRestoreRequest(String backupRootDir,
Contributor

nit: Since you have Builder implemented for PointInTimeRestoreRequest, this method is redundant. Creating a new request is already a one-liner.
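For context, a sketch of why a static factory adds little once a Builder exists. `RestoreRequestSketch` and its three fields are hypothetical and much smaller than the real PointInTimeRestoreRequest; the builder method names are assumptions as well.

```java
/** Hypothetical, trimmed-down request type used only to illustrate the Builder point. */
final class RestoreRequestSketch {
  private final String backupRootDir;
  private final long toDateTime;
  private final boolean overwrite;

  private RestoreRequestSketch(Builder b) {
    this.backupRootDir = b.backupRootDir;
    this.toDateTime = b.toDateTime;
    this.overwrite = b.overwrite;
  }

  String getBackupRootDir() {
    return backupRootDir;
  }

  long getToDateTime() {
    return toDateTime;
  }

  boolean isOverwrite() {
    return overwrite;
  }

  static final class Builder {
    private String backupRootDir;
    private long toDateTime;
    private boolean overwrite;

    Builder withBackupRootDir(String dir) {
      this.backupRootDir = dir;
      return this;
    }

    Builder withToDateTime(long ts) {
      this.toDateTime = ts;
      return this;
    }

    Builder withOverwrite(boolean value) {
      this.overwrite = value;
      return this;
    }

    RestoreRequestSketch build() {
      return new RestoreRequestSketch(this);
    }
  }
}
```

A caller can then write `new RestoreRequestSketch.Builder().withBackupRootDir(dir).withToDateTime(ts).build()` directly, which is the one-liner the comment refers to.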

} else {
// This server is no longer active (e.g., RS moved or removed); skip
if (LOG.isDebugEnabled()) {
LOG.debug("Skipping replication marker timestamp for invalid server: {}", server);
Contributor

"inactive" maybe instead of "invalid" - could be slightly less misleading, these servers are probably not invalid

@anmolnar
Contributor

@vinayakphegde You also need to rebase the patch. Some flaky tests have already been fixed on the feature branch.

@vinayakphegde
Contributor Author

> Overall lgtm, but surprising how low the level of unit tests' granularity is: for instance BackupAdminImpl is such a big class and there's no corresponding test class. It has a lot of private methods which could be tested as "units" if they were package private and we had a BackupAdminTest class. Please consider that.

True, there isn’t a separate test class specifically for BackupAdminImpl. However, if you look at the overall structure, the core implementation resides in BackupAdminImpl, while the other classes primarily handle input preparation (including validation) and delegate the calls to BackupAdminImpl. We have a lot of tests for those classes, which indirectly cover almost all parts of BackupAdminImpl.

That said, I’ll still take a closer look at the class and add tests if I find any gaps.

@anmolnar
Contributor

> > Overall lgtm, but surprising how low the level of unit tests' granularity is: for instance BackupAdminImpl is such a big class and there's no corresponding test class. It has a lot of private methods which could be tested as "units" if they were package private and we had a BackupAdminTest class. Please consider that.
>
> True, there isn’t a separate test class specifically for BackupAdminImpl. However, if you look at the overall structure, the core implementation resides in BackupAdminImpl, while the other classes primarily handle input preparation (including validation) and delegate the calls to BackupAdminImpl. We have a lot of tests for those classes, which indirectly cover almost all parts of BackupAdminImpl.
>
> That said, I’ll still take a closer look at the class and add tests if I find any gaps.

Could you please provide examples of that?
Are they really just proxy calls?

@vinayakphegde
Contributor Author

> > > Overall lgtm, but surprising how low the level of unit tests' granularity is: for instance BackupAdminImpl is such a big class and there's no corresponding test class. It has a lot of private methods which could be tested as "units" if they were package private and we had a BackupAdminTest class. Please consider that.
> >
> > True, there isn’t a separate test class specifically for BackupAdminImpl. However, if you look at the overall structure, the core implementation resides in BackupAdminImpl, while the other classes primarily handle input preparation (including validation) and delegate the calls to BackupAdminImpl. We have a lot of tests for those classes, which indirectly cover almost all parts of BackupAdminImpl.
> > That said, I’ll still take a closer look at the class and add tests if I find any gaps.
>
> Could you please provide examples of that? Are they really just proxy calls?

Yes, for example, mergeBackups() is already covered by tests in TestBackupMerge, TestIncrementalBackupMergeWithBulkLoad, and IntegrationTestBackupRestore. The checkIfValidForMerge() method is private and is only called from mergeBackups(), which is the case for most of the methods in this class.

That said, we can definitely create a dedicated test class for this and add the remaining/uncovered test cases. I’ll open a separate Jira for that. Does that sound good?

@anmolnar
Contributor

> > > > Overall lgtm, but surprising how low the level of unit tests' granularity is: for instance BackupAdminImpl is such a big class and there's no corresponding test class. It has a lot of private methods which could be tested as "units" if they were package private and we had a BackupAdminTest class. Please consider that.
> > >
> > > True, there isn’t a separate test class specifically for BackupAdminImpl. However, if you look at the overall structure, the core implementation resides in BackupAdminImpl, while the other classes primarily handle input preparation (including validation) and delegate the calls to BackupAdminImpl. We have a lot of tests for those classes, which indirectly cover almost all parts of BackupAdminImpl.
> > > That said, I’ll still take a closer look at the class and add tests if I find any gaps.
> >
> > Could you please provide examples of that? Are they really just proxy calls?
>
> Yes, for example, mergeBackups() is already covered by tests in TestBackupMerge, TestIncrementalBackupMergeWithBulkLoad, and IntegrationTestBackupRestore. The checkIfValidForMerge() method is private and is only called from mergeBackups(), which is the case for most of the methods in this class.
>
> That said, we can definitely create a dedicated test class for this and add the remaining/uncovered test cases. I’ll open a separate Jira for that. Does that sound good?

Yeah, these examples are all integration tests, as the name suggests in some cases: they create a "real"/mini HBase cluster, set up a connection, get a reference to Admin, and call methods which are publicly accessible. Looks like this is historically the main approach for HBase testing, but this is not unit testing.

What I'm talking about is instantiating the BackupAdminImpl class directly with a mocked connection and testing the methods individually. In order to do this you have to make the methods accessible from unit tests, for instance, by changing their visibility to package private. I don't strictly insist on doing this, just leaving it here as something to think about. Let me approve this patch regardless.
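A rough sketch of the unit-test style described here, using a hand-written stub in place of a mocking library. Everything in it is hypothetical: `BackupTypeLookup` stands in for whatever the class reads through its connection, and `allIncremental` is an invented rule, not the actual logic of checkIfValidForMerge().

```java
/** Hypothetical narrow dependency that a test can replace with a stub or mock. */
interface BackupTypeLookup {
  String typeOf(String backupId);
}

/** Hypothetical class under test; it only illustrates package-private visibility. */
final class MergeChecker {
  private final BackupTypeLookup lookup;

  MergeChecker(BackupTypeLookup lookup) {
    this.lookup = lookup;
  }

  // Package-private (no modifier), so a test class in the same package can call it
  // directly without starting a mini cluster.
  boolean allIncremental(String... backupIds) {
    for (String id : backupIds) {
      if (!"INCREMENTAL".equals(lookup.typeOf(id))) {
        return false; // unknown ids and non-incremental backups both fail the check
      }
    }
    return true;
  }
}
```

In a test, the dependency can be supplied as a lambda or method reference, for example `new MergeChecker(types::get)` with a `Map<String, String>` of canned values, so each method is exercised in isolation.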

Contributor

@kgeisz kgeisz left a comment

Looks good overall. I just have some minor comments, most of which are for AbstractRestoreDriver.java.

Comment on lines +82 to +83
boolean overwrite = cmd.hasOption(OPTION_OVERWRITE);
if (overwrite) {
Contributor

nit - I see in Line 78 you're just using the hasOption() method and not assigning its output to a variable. Same with Lines 94, 101, and 107. Maybe this should be like that for consistency?

Suggested change
- boolean overwrite = cmd.hasOption(OPTION_OVERWRITE);
- if (overwrite) {
+ if (cmd.hasOption(OPTION_OVERWRITE)) {

Contributor Author

But I am using the value of overwrite in another place (line 152).

Contributor

Got it, I missed that.

Comment on lines +88 to +89
boolean check = cmd.hasOption(OPTION_CHECK);
if (check) {
Contributor

Same as the previous comment

Suggested change
- boolean check = cmd.hasOption(OPTION_CHECK);
- if (check) {
+ if (cmd.hasOption(OPTION_CHECK)) {

Contributor

@kgeisz kgeisz May 30, 2025

I missed that you're using this at the end of the function as well, so no change is needed here.

Comment on lines 84 to 85
LOG.debug("Found -overwrite option in restore command, "
+ "will overwrite to existing table if any in the restore target");
Contributor

@kgeisz kgeisz May 29, 2025

nit - I see you have -o, but from what I can tell -overwrite is not an available alternative option. It may be clearer to have the actual option in parentheses.

Suggested change
- LOG.debug("Found -overwrite option in restore command, "
- + "will overwrite to existing table if any in the restore target");
+ LOG.debug("Found overwrite option (-{}) in restore command, "
+ + "will overwrite to existing table if any in the restore target", OPTION_OVERWRITE);

Comment on lines 90 to 91
LOG.debug(
"Found -check option in restore command, " + "will check and verify the dependencies");
Contributor

nit - Same as the previous comment

Suggested change
- LOG.debug(
- "Found -check option in restore command, " + "will check and verify the dependencies");
+ LOG.debug(
+ "Found check option (-{}) in restore command, will check and verify the dependencies",
+ OPTION_CHECK);

Comment on lines 95 to 96
System.err.println(
"Options -s and -t are mutually exclusive," + " you can not specify both of them.");
Contributor

Maybe add a little info to quickly remind the user what -s and -t do?

Suggested change
- System.err.println(
- "Options -s and -t are mutually exclusive," + " you can not specify both of them.");
+ System.err.printf(
+ "Set name (-%s) and table list (-%s) are mutually exclusive, you can not specify both "
+ + "of them.%n", OPTION_SET, OPTION_TABLE);

If you decide this isn't necessary, then please remove the unnecessary +:

Suggested change
- System.err.println(
- "Options -s and -t are mutually exclusive," + " you can not specify both of them.");
+ System.err.println(
+ "Options -s and -t are mutually exclusive, you can not specify both of them.");
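A small self-contained sketch of the first variant, where the message is built from the option constants so it cannot drift from the actual flags. The constant values "s" and "t" are assumed from the usage text in the PR description, and `OptionConflictSketch` with its `conflictMessage` method is a hypothetical helper, not part of AbstractRestoreDriver.

```java
import java.util.Optional;

final class OptionConflictSketch {
  // Assumed values, taken from the -s and -t flags shown in the command usage.
  static final String OPTION_SET = "s";
  static final String OPTION_TABLE = "t";

  private OptionConflictSketch() {
  }

  /** Returns an error message if both options were supplied, otherwise empty. */
  static Optional<String> conflictMessage(boolean hasSet, boolean hasTable) {
    if (hasSet && hasTable) {
      return Optional.of(String.format(
        "Set name (-%s) and table list (-%s) are mutually exclusive, "
          + "you can not specify both of them.",
        OPTION_SET, OPTION_TABLE));
    }
    return Optional.empty();
  }
}
```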

}
public class RestoreDriver extends AbstractRestoreDriver {
private static final String USAGE_STRING = """
Usage: hbase restore <backup_path> <backup_id> [options]
Contributor

Should <table(s)> be in this usage line as well?

Contributor Author

It is already included in the options list, 3 lines below.

Comment on lines 806 to 807
long dirEndTime = dirStartTime + ONE_DAY_IN_MILLISECONDS - 1; // End time of the day
// (23:59:59)
Contributor

It feels a little odd to have an inline comment on two lines. It's probably because of mvn spotless:apply. You can probably get it all on one line if you trim the comment down:

Suggested change
- long dirEndTime = dirStartTime + ONE_DAY_IN_MILLISECONDS - 1; // End time of the day
- // (23:59:59)
+ long dirEndTime = dirStartTime + ONE_DAY_IN_MILLISECONDS - 1; // End time of day (23:59:59)
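For reference, the arithmetic behind that line can be sketched as below, assuming `dirStartTime` is midnight UTC of the directory's day in epoch milliseconds. `DayWindowSketch`, `startOfUtcDay`, and `overlaps` are hypothetical helpers added for illustration. Note that the end value is the last millisecond of the day, so strictly it corresponds to 23:59:59.999.

```java
import java.util.concurrent.TimeUnit;

final class DayWindowSketch {
  static final long ONE_DAY_IN_MILLISECONDS = TimeUnit.DAYS.toMillis(1);

  private DayWindowSketch() {
  }

  /** Truncates an epoch-millisecond timestamp to midnight UTC of the same day. */
  static long startOfUtcDay(long timestamp) {
    return timestamp - Math.floorMod(timestamp, ONE_DAY_IN_MILLISECONDS);
  }

  /** Last millisecond of the day that starts at dirStartTime. */
  static long endOfDay(long dirStartTime) {
    return dirStartTime + ONE_DAY_IN_MILLISECONDS - 1;
  }

  /** True if the day starting at dirStartTime intersects [rangeStart, rangeEnd]. */
  static boolean overlaps(long dirStartTime, long rangeStart, long rangeEnd) {
    return dirStartTime <= rangeEnd && endOfDay(dirStartTime) >= rangeStart;
  }
}
```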

Comment on lines 233 to 236
// Capture the timestamp of the last WAL entry processed. This is used as the replication
// checkpoint
// so that point-in-time restores know the latest consistent time up to which replication has
// occurred.
Contributor

nit - It looks like there is an accidental newline after checkpoint.

Suggested change
- // Capture the timestamp of the last WAL entry processed. This is used as the replication
- // checkpoint
- // so that point-in-time restores know the latest consistent time up to which replication has
- // occurred.
+ // Capture the timestamp of the last WAL entry processed. This is used as the replication
+ // checkpoint so that point-in-time restores know the latest consistent time up to which
+ // replication has occurred.

* backups with or without continuous backup enabled. 4. Ensuring replication is complete before
* proceeding.
*/
private static void setUpBackupUps() throws Exception {
Contributor

Is this method name a typo? Should it say setUpBackups()?

String targetTableNames =
Arrays.stream(targetTables).map(TableName::getNameAsString).collect(Collectors.joining(","));

return new String[] { "-t", sourceTableNames, "-m", targetTableNames, "-" + OPTION_TO_DATETIME,
Contributor

To be safe, you can use "-" + OPTION_TABLE instead of -t in case the option is ever changed. Similar for -m and the args in buildBackupArgs().
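Something along these lines, as a sketch only: the constant values below are assumptions based on the command usage, `OPTION_TABLE_MAPPING` is a hypothetical name for the -m constant, and plain strings stand in for TableName so the example is self-contained.

```java
final class PitrArgsSketch {
  // Assumed values; in the real code these would come from the shared option constants.
  static final String OPTION_TABLE = "t";
  static final String OPTION_TABLE_MAPPING = "m"; // hypothetical constant name
  static final String OPTION_TO_DATETIME = "to-datetime";

  private PitrArgsSketch() {
  }

  /** Builds the argument array from constants so a renamed flag is picked up automatically. */
  static String[] buildArgs(String[] sourceTables, String[] targetTables, long toDateTime) {
    String sourceTableNames = String.join(",", sourceTables);
    String targetTableNames = String.join(",", targetTables);
    return new String[] { "-" + OPTION_TABLE, sourceTableNames, "-" + OPTION_TABLE_MAPPING,
      targetTableNames, "-" + OPTION_TO_DATETIME, String.valueOf(toDateTime) };
  }
}
```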

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 33s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ HBASE-28957 Compile Tests _ |
| +0 🆗 | mvndep | 0m 17s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 17s | | HBASE-28957 passed |
| +1 💚 | compile | 2m 12s | | HBASE-28957 passed |
| +1 💚 | javadoc | 2m 10s | | HBASE-28957 passed |
| +1 💚 | shadedjars | 5m 59s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 2s | | the patch passed |
| +1 💚 | compile | 2m 12s | | the patch passed |
| +1 💚 | javac | 2m 12s | | the patch passed |
| +1 💚 | javadoc | 2m 9s | | the patch passed |
| +1 💚 | shadedjars | 5m 58s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 3m 9s | /patch-unit-root.txt | root in the patch failed. |
| | | 32m 24s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6717/11/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6717 |
| JIRA Issue | HBASE-29133 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux cfa9f4152a3f 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | HBASE-28957 / 04afea2 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6717/11/testReport/ |
| Max. process+thread count | 315 (vs. ulimit of 30000) |
| modules | C: hbase-backup . U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6717/11/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 58s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 🆗 | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ HBASE-28957 Compile Tests _ |
| +0 🆗 | mvndep | 0m 38s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 4m 36s | | HBASE-28957 passed |
| +1 💚 | compile | 9m 30s | | HBASE-28957 passed |
| +1 💚 | checkstyle | 1m 19s | | HBASE-28957 passed |
| +1 💚 | spotbugs | 9m 18s | | HBASE-28957 passed |
| +1 💚 | spotless | 0m 51s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 44s | | the patch passed |
| +1 💚 | compile | 9m 59s | | the patch passed |
| -0 ⚠️ | javac | 9m 59s | /results-compile-javac-root.txt | root generated 1 new + 1273 unchanged - 0 fixed = 1274 total (was 1273) |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 1m 26s | /buildtool-patch-checkstyle-root.txt | The patch fails to run checkstyle in root |
| +1 💚 | shellcheck | 0m 2s | | No new issues. |
| +1 💚 | spotbugs | 10m 28s | | the patch passed |
| +1 💚 | hadoopcheck | 13m 7s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 53s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 22s | | The patch does not generate ASF License warnings. |
| | | 76m 24s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6717/11/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6717 |
| JIRA Issue | HBASE-29133 |
| Optional Tests | dupname asflicense codespell detsecrets shellcheck shelldocs spotless javac spotbugs checkstyle compile hadoopcheck hbaseanti |
| uname | Linux c8bcb4e6215d 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | HBASE-28957 / 04afea2 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 189 (vs. ulimit of 30000) |
| modules | C: hbase-backup . U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6717/11/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 shellcheck=0.8.0 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@taklwu taklwu merged commit 5fc58af into apache:HBASE-28957 May 30, 2025
1 check failed
anmolnar pushed a commit that referenced this pull request Jul 28, 2025
Signed-off-by: Andor Molnar <andor@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
vinayakphegde added a commit to vinayakphegde/hbase that referenced this pull request Jul 29, 2025
…he#6717)

Signed-off-by: Andor Molnar <andor@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
vinayakphegde added a commit to vinayakphegde/hbase that referenced this pull request Jul 29, 2025
…he#6717)

Signed-off-by: Andor Molnar <andor@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
anmolnar pushed a commit that referenced this pull request Sep 11, 2025
Signed-off-by: Andor Molnar <andor@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
anmolnar pushed a commit that referenced this pull request Nov 6, 2025
Signed-off-by: Andor Molnar <andor@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
5 participants