The default version of yarn is equal to the hadoop version #626
Can one of the admins verify this patch?
pom.xml
Could you also remove these changes around restructuring the dependency locations? These need to be tested and verified separately.
Basically, what I want here is a pull request of roughly 10 to 15 lines that we can verify and merge quickly.
That does not work well; otherwise the version defaults to 1.0.4:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>1.0.4</version>
</dependency>
What's the problem exactly? We don't ever rely on the default version here, right? If someone tries to build with -Pyarn but they don't set the hadoop version to be higher - the build can fail.
You're right, but with mvn -Pyarn clean package, the Hadoop version is 2.2.0.
If the dependency declarations are left as they are now, mvn -DskipTests clean package cannot be executed.
When hadoop.version is 1.0.4, yarn.version is also 1.0.4, and
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>${yarn.version}</version>
</dependency>
is not correct, so mvn -DskipTests clean package cannot be executed.
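The situation described in this comment can be sketched as the following pom.xml fragment. It is only an illustration under stated assumptions, not the actual Spark pom: the `yarn.version` property is assumed to default to `hadoop.version` (as the PR title describes), and the `yarn` profile is assumed to override the Hadoop version to 2.2.0 (as mentioned earlier in the thread). The fragment omits required elements such as `modelVersion` and the project coordinates.

```xml
<!-- Illustrative fragment only; element placement is an assumption. -->
<project>
  <properties>
    <!-- Default Hadoop version mentioned in the discussion. -->
    <hadoop.version>1.0.4</hadoop.version>
    <!-- yarn.version follows hadoop.version unless it is overridden. -->
    <yarn.version>${hadoop.version}</yarn.version>
  </properties>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-yarn-client</artifactId>
        <!-- In a default build this expands to 1.0.4. Hadoop 1.x has no
             YARN artifacts, so resolution would fail in any module that
             actually pulls this dependency in. -->
        <version>${yarn.version}</version>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <profiles>
    <profile>
      <!-- Assumed override: with -Pyarn both properties become 2.2.0. -->
      <id>yarn</id>
      <properties>
        <hadoop.version>2.2.0</hadoop.version>
      </properties>
    </profile>
  </profiles>
</project>
```

Whether a default build fails depends on whether a module built without `-Pyarn` actually uses the managed dependency; if only the YARN modules use it, the unresolved 1.0.4 version is never requested.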
Ah I see. I didn't understand what you were saying before. Is the issue just that the dependency resolution fails?
Yes. I'm sorry, my English is relatively poor.
@pwendell
I tested in the current environment: when the yarn dependency declarations are left as they are now, mvn -DskipTests clean package works. I will restore the declarations I modified.
I suggest making that change again after the 1.0 release.
Great - thanks for paring this down. I can merge it. Let's look at cleaning this up once we ship 1.0.
This is a part of [PR 590](#590)

Author: witgo <witgo@qq.com>

Closes #626 from witgo/yarn_version and squashes the following commits:

c390631 [witgo] restore the yarn dependency declarations
f8a4ad8 [witgo] revert remove the dependency of avro in yarn-alpha
2df6cf5 [witgo] review commit
a1d876a [witgo] review commit
20e7e3e [witgo] review commit
c76763b [witgo] The default value of yarn.version is equal to hadoop.version

(cherry picked from commit fb05432)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
This is a part of PR 590