[SPARK-35868][CORE] Add fs.s3a.downgrade.syncable.exceptions if not set #33044
Conversation
cc @sunchao, @steveloughran

cc @gengliangwang for Apache Spark 3.2.0.
sunchao left a comment
LGTM (non-binding), thanks @dongjoon-hyun!
Thank you, @sunchao!
Kubernetes integration test starting

Kubernetes integration test status failure
gengliangwang left a comment
LGTM
Test build #140229 has finished for PR 33044 at commit
Thank you so much, @gengliangwang! The Python UT failures are unrelated.
lgtm2 |
thx. FWIW, given it's causing trouble, do you want this to be the default in the Hadoop default XML? It's there to stop people attempting to use S3 as a WAL for HBase or similar, but if applications have been treating it as a low-cost operation in general file IO, then we can just downgrade it broadly and rely on the hope that people don't do this.
What changes were proposed in this pull request?
This PR aims to set fs.s3a.downgrade.syncable.exceptions=true if it is not provided by the user.

Why are the changes needed?
Currently, the event log feature is broken with the Hadoop 3.2 profile due to UnsupportedOperationException, because HADOOP-17597 changed the default behavior to throw exceptions since Apache Hadoop 3.3.1. We know this happens because EventLogFileWriters uses hadoopDataStream.foreach(_.hflush()), but this PR aims to provide the same UX across Spark distributions with Hadoop 2/Hadoop 3 in Apache Spark 3.2.0.

Does this PR introduce any user-facing change?
Yes, this will recover the existing behavior.
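To illustrate the failure mode, here is a hypothetical sketch (not the actual S3A code; the class name and use of a plain Map for configuration are stand-ins for illustration) of the behavior HADOOP-17597 introduced: hflush() on an S3A output stream either throws or is downgraded to a warning, depending on fs.s3a.downgrade.syncable.exceptions.

```java
import java.util.Map;

// Stand-in for an S3A output stream; not the real Hadoop implementation.
class SyncableStream {
    private final boolean downgradeSyncables;

    SyncableStream(Map<String, String> conf) {
        // Per the PR description, since Hadoop 3.3.1 the default is to throw,
        // i.e. the downgrade flag defaults to "false".
        this.downgradeSyncables = Boolean.parseBoolean(
            conf.getOrDefault("fs.s3a.downgrade.syncable.exceptions", "false"));
    }

    void hflush() {
        if (!downgradeSyncables) {
            // Strict mode: S3 cannot honor Syncable semantics, so reject the call.
            throw new UnsupportedOperationException(
                "S3A streams are not Syncable; set "
                + "fs.s3a.downgrade.syncable.exceptions=true to downgrade to a warning");
        }
        // Downgraded mode: a real implementation would log a warning and flush.
        System.err.println("WARN: hflush() downgraded on S3A stream");
    }
}
```

With the flag unset, the hflush() call made by Spark's event log writer would hit the throwing branch, which is why this PR sets the flag to true by default.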
How was this patch tested?
Manual.
If the user provides the configuration explicitly, it will revert to the original behavior of throwing exceptions.
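For illustration, an explicit override could be passed via the standard spark.hadoop.* prefix, which Spark forwards to the Hadoop configuration (the application arguments here are placeholders):

```
# Restore the strict Hadoop 3.3.1 behavior explicitly:
spark-submit \
  --conf spark.hadoop.fs.s3a.downgrade.syncable.exceptions=false \
  ...
```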