[fix](hdfs) Fix hdfsExists returning a stale root cause (apache#27991)
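The HDFS native client does not clear the last exception as expected, so `hdfsGetLastExceptionRootCause` may return a stale root cause left over from an earlier call. This PR saves the last root cause before calling `hdfsExists` and, when `hdfsExists` returns a non-zero code, reports an error only if the root cause differs from the saved one.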
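The failure mode can be reproduced without a cluster. The sketch below is a mock, not libhdfs: `mock::last_root_cause`, `mock::failing_open`, `mock::exists`, and `naive_reports_error` are hypothetical names that only imitate the behaviour described above, under the assumption that the error state is per-thread and is never cleared by a later call.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the native client's per-thread error state.
// None of these names belong to libhdfs.
namespace mock {

thread_local const char* last_root_cause = nullptr;

// Returns the most recently recorded root cause, or nullptr if there is none.
inline const char* get_last_root_cause() { return last_root_cause; }

// An unrelated operation that fails and records a root cause.
inline int failing_open() {
    static const char kCause[] = "java.net.ConnectException: connection refused";
    last_root_cause = kCause;
    return -1;
}

// Imitates an existence check: 0 if the path exists, non-zero otherwise.
// A missing path is not an error, so no root cause is recorded, but the
// previous one is not cleared either (the assumed bug).
inline int exists(const std::string& path) {
    return path == "/present" ? 0 : 1;
}

} // namespace mock

// Naive check, modelled on the code before this commit: any non-null root
// cause is attributed to the call that was just made.
inline bool naive_reports_error(const std::string& path) {
    mock::exists(path);
    return mock::get_last_root_cause() != nullptr;
}
```

In this mock, once `mock::failing_open()` has run on a thread, `naive_reports_error` returns true for every later path on that thread, including paths that exist, because the root cause it sees belongs to the earlier call.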
w41ter authored and 胥剑旭 committed Dec 14, 2023
1 parent dfe372a commit 1428754
Showing 1 changed file with 16 additions and 3 deletions.
19 changes: 16 additions & 3 deletions be/src/io/fs/hdfs_file_system.cpp
@@ -227,15 +227,28 @@ Status HdfsFileSystem::delete_internal(const Path& path, int is_recursive) {
 Status HdfsFileSystem::exists_impl(const Path& path, bool* res) const {
     CHECK_HDFS_HANDLE(_fs_handle);
     Path real_path = convert_path(path, _fs_name);
+#ifdef USE_HADOOP_HDFS
+    // HACK: the HDFS native client won't clear the last exception as expected, so
+    // `hdfsGetLastExceptionRootCause` might return a stale root cause. Save the
+    // last root cause here and verify it after hdfsExists returns a non-zero code.
+    //
+    // See details:
+    //  https://github.com/apache/hadoop/blob/5cda162a804fb0cfc2a5ac0058ab407662c5fb00/
+    //  hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jni_helper.c#L795
+    char* former_root_cause = hdfsGetLastExceptionRootCause();
+#endif
     int is_exists = hdfsExists(_fs_handle->hdfs_fs, real_path.string().c_str());
 #ifdef USE_HADOOP_HDFS
     // When hdfsExists() returns a non-zero code:
     // - if root_cause is nullptr, the file does not exist;
     // - if root_cause is not nullptr, another error occurred and should be returned.
     // NOTE: not for libhdfs3, since it only runs on macOS and we don't have to support it.
-    char* root_cause = hdfsGetLastExceptionRootCause();
-    if (root_cause != nullptr) {
-        return Status::IOError("failed to check path existence {}: {}", path.native(), root_cause);
+    if (is_exists != 0) {
+        char* root_cause = hdfsGetLastExceptionRootCause();
+        if (root_cause != nullptr && root_cause != former_root_cause) {
+            return Status::IOError("failed to check path existence {}: {}", path.native(),
+                                   root_cause);
+        }
     }
 #endif
     *res = (is_exists == 0);
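The decision made by the patched block can be summarised as a small pure function. This is only a sketch for reasoning about the three outcomes: `ExistsOutcome` and `classify_exists` are hypothetical names that do not appear in the commit, and the function assumes, as the patch does, that a newly raised exception yields a root-cause pointer different from the one sampled before the call.

```cpp
#include <cassert>

// Hypothetical outcome type; the real code returns a Status and sets *res.
enum class ExistsOutcome { kExists, kNotFound, kError };

// rc:     return code of the existence check (0 means the path exists).
// before: root-cause pointer sampled before the call (may be nullptr).
// after:  root-cause pointer sampled after the call (may be nullptr).
inline ExistsOutcome classify_exists(int rc, const char* before, const char* after) {
    if (rc == 0) {
        return ExistsOutcome::kExists;  // success: the root cause is not consulted
    }
    if (after != nullptr && after != before) {
        return ExistsOutcome::kError;   // a root cause appeared during this call
    }
    return ExistsOutcome::kNotFound;    // no root cause, or only the stale one
}
```

The comparison is on pointer identity, not on message text, so a repeated failure with an identical message is still reported as long as the client hands back a different pointer for it. If the client were to reuse the same pointer for a new error, this check would classify it as "not found"; that is a limitation of the assumption, not something the commit addresses.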