-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-19635. ABFS: Marker creation failure should not be propagated #7825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
🎊 +1 overall
This message was automatically generated. |
============================================================
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR updates the Azure Blob File System (ABFS) implementation to prevent marker creation failures from being propagated to users. Marker creation is a best-effort operation that writes 0-byte files to indicate folder presence, but failures in this operation should not cause the primary operation to fail.
- Wrapped marker creation calls in try-catch blocks to catch
AbfsRestOperationException - Added debug logging when marker creation fails instead of propagating the exception
- Added a test to verify that
getPathStatusoperations complete successfully even when marker creation fails
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| AbfsBlobClient.java | Added exception handling around marker creation in setPathProperties and getPathStatus methods |
| ITestAzureBlobFileSystemOauth.java | Added test case to verify marker creation failures don't propagate during getPathStatus operations |
...-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java
Outdated
Show resolved
Hide resolved
...-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java
Outdated
Show resolved
Hide resolved
.../hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemOauth.java
Outdated
Show resolved
Hide resolved
| this.createPathRestOp(path, false, false, false, null, | ||
| contextEncryptionAdapter, tracingContext); | ||
| try { | ||
| this.createPathRestOp(path, false, false, false, null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test case to check this change is not added, should we add that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is no longer needed
| this.createPathRestOp(path, false, false, false, null, | ||
| contextEncryptionAdapter, tracingContext); | ||
| } catch (AbfsRestOperationException exception) { | ||
| LOG.debug("Marker creation failed for path {} during setPathProperties", path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better we can log exception status code as well or some details of exception just to know what the exact reason of the failure is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on this.
At least status code and storage error code (Enum like PATHNOTFOUND) we should print in debug log.
Also a comment here on why we are swallowing the error for future reference will be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taken
| this.createMarkerAtPath(path, null, contextEncryptionAdapter, | ||
| tracingContext); | ||
| } catch (AbfsRestOperationException exception) { | ||
| LOG.debug("Marker creation failed for path {} during getPathStatus ", path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
| this.createPathRestOp(path, false, false, false, null, | ||
| contextEncryptionAdapter, tracingContext); | ||
| } catch (AbfsRestOperationException exception) { | ||
| LOG.debug("Marker creation failed for path {} during setPathProperties", path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on this.
At least status code and storage error code (Enum like PATHNOTFOUND) we should print in debug log.
Also a comment here on why we are swallowing the error for future reference will be better.
| try { | ||
| this.createMarkerAtPath(path, null, contextEncryptionAdapter, | ||
| tracingContext); | ||
| } catch (AbfsRestOperationException exception) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was keeping this behind a config not part of the plan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we planned to just swallow exceptions instead of config changes
| // Implicit path found. | ||
| // Create a marker blob at this path. | ||
| this.createMarkerAtPath(path, null, contextEncryptionAdapter, tracingContext); | ||
| try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For confirmation- we want to swallow exceptions only for metadata related operations and allow permission exceptions for all write ones (createDirectory, createFile), correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
marker creation is best effort for us, we don't want to fail the actual operation called if marker creation fails, so only for marker creation we are swallowing exception and not blocking the actual call
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
| * */ | ||
| @Test | ||
| public void testGetPathStatusWithReader() throws Exception { | ||
| String clientId = this.getConfiguration().get(TestConfigurationKeys.FS_AZURE_BLOB_DATA_READER_CLIENT_ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: since we have already imported TestConfigurationKeys constants we can use FS_AZURE_BLOB_DATA_READER_CLIENT_ID and FS_AZURE_BLOB_DATA_READER_CLIENT_SECRET directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taken
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
anujmodi2021
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
LGTM
bhattmanish98
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
…pache#7825) Contributed by Anmol Asrani
Marker creation is a best-effort operation performed during folder-related actions such as create, getPathStatus, setPathProperties, and rename. It involves writing a 0-byte file to indicate the presence of a folder. However, marker creation is not critical to the success of the primary operation. This change ensures that failures encountered during marker creation (e.g., due to transient issues or permission errors) are not propagated back to the user, preserving expected behavior and preventing unnecessary operation failures.