-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29219 Ignore Empty WAL Files While Consuming Backed-Up WAL Files #7106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. I just have a couple of minor comments.
Also, just curious - What happens when a WAL that has bogus/garbage values is read? Is an exception thrown? This patch adds the ability to skip the empty WALs, so it made me wonder what happens with bad/corrupted WALs.
| // Create an empty WAL file in a test input directory | ||
| FileSystem dfs = TEST_UTIL.getDFSCluster().getFileSystem(); | ||
| Path inputDir = new Path("/empty-wal-dir"); | ||
| dfs.mkdirs(inputDir); | ||
|
|
||
| Path emptyWAL = new Path(inputDir, "empty.wal"); | ||
| FSDataOutputStream out = dfs.create(emptyWAL); | ||
| out.close(); // Creates a 0-byte file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I see some code that's used in both test methods, so you could put this in a function if you'd like. Something like:
private void createEmptyWALFile(String walDir) {
// Create an empty WAL file in a test input directory
FileSystem dfs = TEST_UTIL.getDFSCluster().getFileSystem();
Path inputDir = new Path("/" + walDir);
dfs.mkdirs(inputDir);
Path emptyWAL = new Path(inputDir, "empty.wal");
FSDataOutputStream out = dfs.create(emptyWAL);
out.close(); // Creates a 0-byte file
}
| Path emptyWAL = new Path(inputDir, "empty.wal"); | ||
| FSDataOutputStream out = dfs.create(emptyWAL); | ||
| out.close(); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: For consistency, maybe add the same assertions you have in the other test method regarding the newly created empty WAL file:
| assertTrue("Empty WAL file should exist", dfs.exists(emptyWAL)); | |
| assertEquals("WAL file should be 0 bytes", 0, dfs.getFileStatus(emptyWAL).getLen()); | |
Kota-SH
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
However, do we need a new config for this? Should we not skip empty WALs by default?
| * have a timestamp, we will just return it w/o filtering. | ||
| */ | ||
| private List<FileStatus> getFiles(FileSystem fs, Path dir, long startTime, long endTime) | ||
| List<FileStatus> getFiles(FileSystem fs, Path dir, long startTime, long endTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If this is only for testing, can we add @VisibleForTesting here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is used in WALInputFormat class itself. I just made it package private so that I can use them in tests.
Yes, We'll get below exception |
I'm not entirely sure. Skipping empty WALs by default might change existing behavior and potentially break some workflows where users expect exceptions to be thrown in such cases. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
156e26c to
261d066
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
taklwu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but it's better to add a javadoc when we should set it true, especially for continuous backup and non-continuous backup
| Configuration conf = HBaseConfiguration.create(conn.getConfiguration()); | ||
| conf.setLong(WALInputFormat.START_TIME_KEY, startTime); | ||
| conf.setLong(WALInputFormat.END_TIME_KEY, endTime); | ||
| conf.setBoolean(IGNORE_EMPTY_FILES, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, this flag basically is only used for Continuous Backup? and should it be false other than the use case of Continuous Backup ?
nit: maybe add a javadoc comment in the code that at what situation we should and we should not use this flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the default behavior is false, so I didn’t want to change it as it might be assumed in other parts of the code.
nit: maybe add a Javadoc comment explaining in what situations we should or shouldn't use this flag.
Sure — this flag controls whether the WALPlayer job should throw an exception when it encounters an empty file that it can't parse as a valid WAL file, or whether it should skip it silently.
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
taklwu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
#7106) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
apache#7106) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
apache#7106) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
#7106) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
#7106) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
Introduces a new configuration
wal.input.ignore.empty.filesto ignore the empty WAL files.