Promote cudf as dist direct dependency, mark aggregator provided #4043
Signed-off-by: Gera Shegalov <gera@apache.org>
build
So is this actually the right thing to do? I thought our old dependency-reduced pom didn't have the Spark dependencies as provided in it, but it would be good to double-check that. So this changes the scope to compile? It would be nice to have more description, because we aren't removing the dependency from the pom; we are changing the dependency scope.
Actually, I think I was wrong; I must have been looking at the wrong pom, or at the dependency-reduced pom from the aggregator. It shows them as provided.
I don't think this is what we want. If you look at the generated The dist pom is pulling in the aggregator pom with the understanding that we'll get all of the aggregator dependencies, transitively. That's exactly what we want during the compile/build phase, but when it comes to advertising the dependencies in the deployed artifact, we need to remove the aggregator jar from the dependency tree underneath us, but also move all of the aggregator jar dependencies up to become dist jar dependencies. The dist jar contains the aggregator jar contents but not its transitive dependencies, so the dist jar dependencies should look similar to the aggregator jar dependencies. In the end I would expect the dependency-reduced pom to list the Apache Spark jars along with the cudf jar as provided dependencies.
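As a rough illustration of the outcome described in this comment, the dependency section of the deployed (dependency-reduced) dist pom might look like the fragment below. This is a sketch, not the actual generated file: the `ai.rapids:cudf` and `org.apache.spark` coordinates, the `_2.12` suffixes, and the version placeholders are assumptions chosen for illustration.

```xml
<!-- Sketch only: coordinates and version placeholders are assumed,
     not copied from the generated dependency-reduced pom. -->
<dependencies>
  <!-- cudf promoted to a direct dependency of the dist jar -->
  <dependency>
    <groupId>ai.rapids</groupId>
    <artifactId>cudf</artifactId>
    <version>${cudf.version}</version>
    <scope>provided</scope>
  </dependency>
  <!-- Spark jars that were dependencies of the aggregator -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.12</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.12</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

The aggregator artifact itself does not appear in this sketch, because its contents are already packaged inside the dist jar.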
@jlowe please update #3935 to reflect that you mean cudf and the aggregator's other immediate child dependencies need to be promoted to the dist jar as provided. The issue currently talks only about spark-hive and spark-sql. Looks like we want to re-enable dependency-reduced-pom generation in the
Argh, I don't think we can just grab the aggregator dependencies, since some of those dependencies have been pulled into the dist jar as part of the custom shading (sql-plugin, shuffle-plugin, etc.). We'd have to filter out the aggregator dependencies we know have been pulled into the dist jar. As for listing whatever Spark jars are specified by the first aggregator, is that what we want? It would be nice to show the matrix of support -- we could put the
I also thought of using profiles; propagating release3XY profiles, with release301 being the default, looks reasonable to me.
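A minimal sketch of what such profiles could look like in a pom. The profile ids follow the release3XY naming mentioned above; the `spark.version` property name and the `release311` profile are hypothetical, picked only to show the shape.

```xml
<!-- Sketch only: property name and the second profile are hypothetical. -->
<profiles>
  <profile>
    <id>release301</id>
    <activation>
      <!-- Used when no other profile in this pom is activated -->
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <spark.version>3.0.1</spark.version>
    </properties>
  </profile>
  <profile>
    <id>release311</id>
    <properties>
      <spark.version>3.1.1</spark.version>
    </properties>
  </profile>
</profiles>
```

With this layout, passing `-Prelease311` to Maven would select the second profile, and the default one would no longer apply.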
ab7ca25 to c9161d8
I investigated the option The only way I was able to make it work is by switching As for advertising the right version of the spark dependencies, I think we can skip adding profiles to this pom. We publish the parent pom.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>hello-world</artifactId>
  <version>1.0-SNAPSHOT</version>
  <name>Hello World</name>
  <url>http://nvidia.com</url>
  <dependencies>
    <dependency>
      <groupId>com.nvidia</groupId>
      <artifactId>rapids-4-spark_2.12</artifactId>
      <version>21.12.0-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>

But if we want to see cudf and spark as part of the dependency tree they have to be
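The change named in the PR title could be sketched roughly as follows in the dist module's pom. The thread does not show the actual diff, so the aggregator artifactId, the `ai.rapids:cudf` coordinates, and the property names below are assumptions for illustration. The cudf entry omits `<scope>`, which gives it Maven's default `compile` scope; whether the PR settles on that scope is not confirmed here.

```xml
<!-- Sketch only: artifact names and properties are assumed, not taken from the PR diff. -->
<dependencies>
  <!-- Aggregator is needed at build time but is not advertised to consumers -->
  <dependency>
    <groupId>com.nvidia</groupId>
    <artifactId>rapids-4-spark-aggregator_2.12</artifactId>
    <version>${project.version}</version>
    <scope>provided</scope>
  </dependency>
  <!-- cudf declared directly on dist instead of arriving via the aggregator -->
  <dependency>
    <groupId>ai.rapids</groupId>
    <artifactId>cudf</artifactId>
    <version>${cudf.version}</version>
  </dependency>
</dependencies>
```

Maven does not pass `provided`-scope dependencies on to consumers transitively, so a downstream project such as the hello-world pom above only sees dependencies that the dist pom declares with a transitive scope.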
Signed-off-by: Gera Shegalov <gera@apache.org>
build
I reproduced the CI failure locally:
Working on the fix. Debugged the install plugin and It turns out that it's not safe to use the Another benefit of the static file is that it will be automatically consistent in META-INF/maven
Signed-off-by: Gera Shegalov <gera@apache.org>
build
build
LGTM.
We will keep an eye on this to see whether any CI setup needs to be updated.
Closes #3935
Signed-off-by: Gera Shegalov <gera@apache.org>