Conversation

@vinayakphegde
Contributor

Introduces a new configuration, wal.input.ignore.empty.files, to ignore empty WAL files when consuming backed-up WAL files.
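
For context, a minimal sketch of how a job might opt in to the new behavior; the property key comes from this description, while the surrounding setup is illustrative only and not part of the patch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class IgnoreEmptyWalFilesExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Skip 0-byte WAL files instead of failing while reading the PB WAL magic.
    // The key below is the new property introduced by this PR; false remains the default.
    conf.setBoolean("wal.input.ignore.empty.files", true);
    // Hypothetical next step: pass 'conf' to the MapReduce job (e.g. WALPlayer)
    // that consumes the backed-up WAL files.
  }
}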

@vinayakphegde vinayakphegde changed the title HBASE-29219: Ignore Empty WAL Files While Consuming Backed-Up WAL Files HBASE-29219 Ignore Empty WAL Files While Consuming Backed-Up WAL Files Jun 16, 2025
@kgeisz (Contributor) left a comment

LGTM overall. I just have a couple of minor comments.

Also, just curious: what happens when a WAL that has bogus/garbage values is read? Is an exception thrown? This patch adds the ability to skip empty WALs, so it made me wonder what happens with bad/corrupted WALs.

Comment on lines 345 to 352
// Create an empty WAL file in a test input directory
FileSystem dfs = TEST_UTIL.getDFSCluster().getFileSystem();
Path inputDir = new Path("/empty-wal-dir");
dfs.mkdirs(inputDir);

Path emptyWAL = new Path(inputDir, "empty.wal");
FSDataOutputStream out = dfs.create(emptyWAL);
out.close(); // Creates a 0-byte file
Contributor

nit: I see some code that's used in both test methods, so you could put this in a function if you'd like. Something like:

private void createEmptyWALFile(String walDir) throws IOException {
    // Create an empty WAL file in a test input directory
    FileSystem dfs = TEST_UTIL.getDFSCluster().getFileSystem();
    Path inputDir = new Path("/" + walDir);
    dfs.mkdirs(inputDir);
    Path emptyWAL = new Path(inputDir, "empty.wal");
    FSDataOutputStream out = dfs.create(emptyWAL);
    out.close(); // Creates a 0-byte file
}

Path emptyWAL = new Path(inputDir, "empty.wal");
FSDataOutputStream out = dfs.create(emptyWAL);
out.close();

@kgeisz (Contributor), Jun 17, 2025

nit: For consistency, maybe add the same assertions you have in the other test method regarding the newly created empty WAL file:

Suggested change
assertTrue("Empty WAL file should exist", dfs.exists(emptyWAL));
assertEquals("WAL file should be 0 bytes", 0, dfs.getFileStatus(emptyWAL).getLen());

@Kota-SH (Contributor) left a comment

LGTM.

However, do we need a new config for this? Should we not skip empty WALs by default?

   * have a timestamp, we will just return it w/o filtering.
   */
-  private List<FileStatus> getFiles(FileSystem fs, Path dir, long startTime, long endTime)
+  List<FileStatus> getFiles(FileSystem fs, Path dir, long startTime, long endTime)
Contributor

nit: If this is only for testing, can we add @VisibleForTesting here?

Contributor Author

No, this is used in the WALInputFormat class itself. I just made it package-private so that I can use it in tests.

@vinayakphegde
Contributor Author

> LGTM overall. I just have a couple of minor comments.
>
> Also, just curious: what happens when a WAL that has bogus/garbage values is read? Is an exception thrown? This patch adds the ability to skip empty WALs, so it made me wonder what happens with bad/corrupted WALs.

Yes, we'll get the exception below:
java.lang.Exception: org.apache.hadoop.hbase.regionserver.wal.WALHeaderEOFException: EOF while reading PB WAL magic
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) ~[hadoop-mapreduce-client-common-3.4.1.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) ~[hadoop-mapreduce-client-common-3.4.1.jar:?]
Caused by: org.apache.hadoop.hbase.regionserver.wal.WALHeaderEOFException: EOF while reading PB WAL magic
	at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufWALReader.readHeader(AbstractProtobufWALReader.java:221) ~[classes/:?]
	at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufWALReader.init(AbstractProtobufWALReader.java:147) ~[classes/:?]
	at org.apache.hadoop.hbase.wal.WALFactory.createStreamReader(WALFactory.java:417) ~[classes/:?]
	at org.apache.hadoop.hbase.wal.WALFactory.createStreamReader(WALFactory.java:538) ~[classes/:?]
	at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.openReader(WALInputFormat.java:162) ~[classes/:?]
	at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.openReader(WALInputFormat.java:204) ~[classes/:?]
	at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.initialize(WALInputFormat.java:197) ~[classes/:?]
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:561) ~[hadoop-mapreduce-client-core-3.4.1.jar:?]
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) ~[hadoop-mapreduce-client-core-3.4.1.jar:?]
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348) ~[hadoop-mapreduce-client-core-3.4.1.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) ~[hadoop-mapreduce-client-common-3.4.1.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:203) ~[?:?]
	at java.io.DataInputStream.readFully(DataInputStream.java:172) ~[?:?]
	at

@vinayakphegde
Contributor Author

> LGTM.
>
> However, do we need a new config for this? Should we not skip empty WALs by default?

I'm not entirely sure. Skipping empty WALs by default might change existing behavior and potentially break some workflows where users expect exceptions to be thrown in such cases.

@taklwu (Contributor) left a comment

LGTM, but it would be better to add a Javadoc explaining when this should be set to true, especially for continuous backup vs. non-continuous backup.

Configuration conf = HBaseConfiguration.create(conn.getConfiguration());
conf.setLong(WALInputFormat.START_TIME_KEY, startTime);
conf.setLong(WALInputFormat.END_TIME_KEY, endTime);
conf.setBoolean(IGNORE_EMPTY_FILES, true);
Contributor

So, is this flag basically only used for continuous backup? And should it be false for use cases other than continuous backup?

nit: maybe add a Javadoc comment in the code explaining in what situations we should and should not use this flag.

Contributor Author

Yeah, the default behavior is false, so I didn't want to change it, as the existing behavior might be assumed in other parts of the code.

> nit: maybe add a Javadoc comment explaining in what situations we should or shouldn't use this flag.

Sure — this flag controls whether the WALPlayer job should throw an exception when it encounters an empty file that it can't parse as a valid WAL file, or whether it should skip it silently.
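
To make the suggestion concrete, here is one possible wording for such a Javadoc; this is only a sketch that assumes IGNORE_EMPTY_FILES (seen in the diff above) is the constant holding the wal.input.ignore.empty.files key, and the exact placement and text are not part of the merged patch:

/**
 * When true, the WAL input silently skips 0-byte (empty) WAL files instead of
 * failing with a WALHeaderEOFException ("EOF while reading PB WAL magic").
 * Set it to true for continuous-backup jobs, where empty WAL files can legitimately
 * show up in the backup WAL directory; leave it at the default (false) when an
 * empty file should be treated as an error.
 */
// Assumed declaration; the constant name and property key are taken from this PR.
public static final String IGNORE_EMPTY_FILES = "wal.input.ignore.empty.files";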

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ HBASE-28957 Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for branch
+1 💚 mvninstall 3m 12s HBASE-28957 passed
+1 💚 compile 1m 6s HBASE-28957 passed
-0 ⚠️ checkstyle 0m 10s /buildtool-branch-checkstyle-hbase-backup.txt The patch fails to run checkstyle in hbase-backup
+1 💚 spotbugs 0m 59s HBASE-28957 passed
+1 💚 spotless 0m 46s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for patch
+1 💚 mvninstall 3m 3s the patch passed
+1 💚 compile 1m 4s the patch passed
+1 💚 javac 1m 4s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 9s /buildtool-patch-checkstyle-hbase-backup.txt The patch fails to run checkstyle in hbase-backup
+1 💚 spotbugs 1m 12s the patch passed
+1 💚 hadoopcheck 12m 1s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 0m 44s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 17s The patch does not generate ASF License warnings.
33m 46s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7106/6/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7106
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 94a6c6dd635c 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-28957 / d8f931a
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 84 (vs. ulimit of 30000)
modules C: hbase-mapreduce hbase-backup U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7106/6/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 32s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ HBASE-28957 Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for branch
+1 💚 mvninstall 3m 7s HBASE-28957 passed
+1 💚 compile 0m 40s HBASE-28957 passed
+1 💚 javadoc 0m 27s HBASE-28957 passed
+1 💚 shadedjars 5m 58s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 3m 12s the patch passed
+1 💚 compile 0m 40s the patch passed
+1 💚 javac 0m 40s the patch passed
+1 💚 javadoc 0m 27s the patch passed
+1 💚 shadedjars 5m 59s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
-1 ❌ unit 27m 12s /patch-unit-hbase-mapreduce.txt hbase-mapreduce in the patch failed.
-1 ❌ unit 23m 2s /patch-unit-hbase-backup.txt hbase-backup in the patch failed.
73m 13s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7106/6/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7106
Optional Tests javac javadoc unit compile shadedjars
uname Linux 1e90e7f5e0f5 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-28957 / d8f931a
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7106/6/testReport/
Max. process+thread count 3731 (vs. ulimit of 30000)
modules C: hbase-mapreduce hbase-backup U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7106/6/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@taklwu (Contributor) left a comment

+1

@taklwu taklwu merged commit a7a6d3c into apache:HBASE-28957 Jun 24, 2025
1 check failed
anmolnar pushed a commit that referenced this pull request Jul 28, 2025
#7106)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
vinayakphegde added a commit to vinayakphegde/hbase that referenced this pull request Jul 29, 2025
apache#7106)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>   
Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
vinayakphegde added a commit to vinayakphegde/hbase that referenced this pull request Jul 29, 2025
apache#7106)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
anmolnar pushed a commit that referenced this pull request Sep 11, 2025
#7106)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
anmolnar pushed a commit that referenced this pull request Nov 6, 2025
#7106)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>