HIVE-25317: Relocate dependencies in shaded hive-exec module #2459
Conversation
Thanks @sunchao. This seems to be caused by relocating avro and protobuf. I reverted the two relocations to see what happens.
@viirya there are still more errors:
This reverts commit 4644b3b.
I'm not sure why this happens, or if it is caused by this change.
I'll take a look @viirya - where did you find this error? Do you have a link? I'm seeing these errors in the last run:
The quoted test failure is from "Testing / split-20 / Archive / testCommitCreateTablePlusCommitDropTableWithoutPurge – org.apache.hadoop.hive.druid.TestDruidStorageHandler" in http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-2459/11/tests.
Yea, I'm trying to deal with
I see. Yea, this is very weird. I suspect there is something else that caused it to fail, for instance, when initializing some of the enum items.
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>${project.version}</version>
<classifier>core</classifier>
don't use the core artifact - that's just bad!
what are you trying to achieve here with this?
branch-2.3?
please note that changes should land on master first
As more dependencies are relocated here, modules that depend on the non-core artifact will hit class-not-found errors...
The motivation is that we want to use the shaded version of hive-exec (i.e., w/o classifier) in Spark, to make sure it doesn't conflict with the Guava version there. But more dependencies conflict with Spark, so we need to relocate those dependencies too.
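For context, this is roughly what such relocations look like inside the maven-shade-plugin <configuration> of the hive-exec pom; the shaded package prefix and the patterns below are illustrative assumptions, not the exact list from this PR:

<relocations>
  <relocation>
    <!-- move protobuf classes under a Hive-internal package so they cannot clash with Spark's copy -->
    <pattern>com.google.protobuf</pattern>
    <shadedPattern>org.apache.hive.com.google.protobuf</shadedPattern>
  </relocation>
  <relocation>
    <!-- same idea for avro, another dependency mentioned in this thread -->
    <pattern>org.apache.avro</pattern>
    <shadedPattern>org.apache.hive.org.apache.avro</shadedPattern>
  </relocation>
</relocations>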
@sunchao do we need to have a similar change on master first?
Yea, I think so, since this PR is trying to shade things from hive-exec-core, I believe?
We should fix any issues with using the normal hive-exec artifact - loading the core jar could cause trouble...
Note: on branch-2, Guava is most likely not properly shaded away; see HIVE-22126.
@kgyrtkirk Guava is shaded in branch-2.3 via https://issues.apache.org/jira/browse/HIVE-23980. The issue is that, for Spark to use the shaded hive-exec, Hive needs to relocate more classes while making sure this doesn't break other modules (for instance, if a shaded class appears in a certain API and another module imports the unshaded version of that class by itself).
We have since abandoned this approach and decided to shade hive-exec-core within Spark itself, following a similar approach to Trino (see https://github.com/trinodb/trino-hive-apache).
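A rough sketch of that consumer-side alternative, where the downstream project depends on the core artifact and performs the relocation in its own shade configuration (the version property and the relocation prefix below are hypothetical placeholders, not Spark's or Trino's actual settings):

<!-- dependency on the unshaded core artifact -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive.version}</version>
  <classifier>core</classifier>
</dependency>

<!-- inside the downstream project's maven-shade-plugin <configuration> -->
<relocation>
  <pattern>com.google.common</pattern>
  <shadedPattern>downstream.shaded.com.google.common</shadedPattern>
</relocation>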
We could relocate/shade away those deps to make it possible for other projects to use the normal artifact - it seems like there is a very good list in the Trino project.
Closing this, as we are taking a different direction and shading Hive's dependencies on the Spark side.
What changes were proposed in this pull request?
Relocate dependencies that could conflict with Spark.
Why are the changes needed?
When we want to use the shaded version of hive-exec (i.e., w/o classifier) in Spark, more dependencies conflict with Spark, so we need to relocate these dependencies too.
Does this PR introduce any user-facing change?
Downstream projects that previously relied on dependencies bundled in the shaded release might need to declare those dependencies explicitly after the relocation, as in the example below.
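For example, a downstream build that previously picked up protobuf-java classes from the shaded jar would need to declare the dependency itself once those classes are relocated (the version property is a placeholder):

<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>${protobuf.version}</version>
</dependency>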
How was this patch tested?
CI