Skip to content

Conversation

@sunchao
Copy link
Member

@sunchao sunchao commented Sep 23, 2021

What changes were proposed in this pull request?

Enable createDependencyReducedPom for Spark's Maven shaded plugin so that the effective pom won't contain those shaded artifacts such as org.eclipse.jetty

Why are the changes needed?

At the moment, the effective pom leaks transitive dependencies to downstream apps for those shaded artifacts, which potentially will cause issues.

Does this PR introduce any user-facing change?

No

How was this patch tested?

I manually tested and the core/dependency-reduced-pom.xml no longer contains dependencies such as jetty-XX.

@github-actions github-actions bot added the BUILD label Sep 24, 2021
@sunchao
Copy link
Member Author

sunchao commented Sep 24, 2021

Local build looks good, and let's see how Spark CI reacts. cc @JoshRosen

@sunchao sunchao changed the title [SPARK-36835] Enable createDependencyReducedPom for Maven shaded plugin [SPARK-36835][BUILD] Enable createDependencyReducedPom for Maven shaded plugin Sep 24, 2021
@SparkQA
Copy link

SparkQA commented Sep 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48083/

Copy link
Contributor

@JoshRosen JoshRosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I did a local mvn install and confirmed that the generated POMs correctly exclude the shaded depdendencies.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@SparkQA
Copy link

SparkQA commented Sep 24, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48083/

@gengliangwang
Copy link
Member

Thanks, merging to master and branch-3.2

gengliangwang pushed a commit that referenced this pull request Sep 24, 2021
…ed plugin

### What changes were proposed in this pull request?

Enable `createDependencyReducedPom` for Spark's Maven shaded plugin so that the effective pom won't contain those shaded artifacts such as `org.eclipse.jetty`

### Why are the changes needed?

At the moment, the effective pom leaks transitive dependencies to downstream apps for those shaded artifacts, which potentially will cause issues.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

I manually tested and the `core/dependency-reduced-pom.xml` no longer contains dependencies such as `jetty-XX`.

Closes #34085 from sunchao/SPARK-36835.

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit ed88e61)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
@SparkQA
Copy link

SparkQA commented Sep 24, 2021

Test build #143574 has finished for PR 34085 at commit 08b1f31.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

+1, LGTM.

@gengliangwang
Copy link
Member

gengliangwang commented Sep 24, 2021

@sunchao @JoshRosen I find that after merging this PR, Spark can't build with Hadoop2.7. The build hangs for over 1 hour

$ ./build/mvn clean package -DskipTests -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.12 -Phadoop-2.7 -Phive -Phive-thriftserver

...
[WARNING] spark-launcher_2.12-3.2.0.jar, unused-1.0.0.jar define 1 overlapping resource:
[WARNING]   - META-INF/MANIFEST.MF
[WARNING] maven-shade-plugin has detected that some class files are
[WARNING] present in two or more JARs. When this happens, only one
[WARNING] single version of the class is copied to the uber jar.
[WARNING] Usually this is not harmful and you can skip these warnings,
[WARNING] otherwise try to manually exclude artifacts based on
[WARNING] mvn dependency:tree -Ddetail=true and the above output.
[WARNING] See http://maven.apache.org/plugins/maven-shade-plugin/
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing /opt/spark-rm/output/spark-3.2.0-bin-hadoop2.7/launcher/target/spark-launcher_2.12-3.2.0.jar with /opt/spark-rm/output/spark-3.2.0-bin-hadoop2.7/launcher/target/spark-launcher_2.12-3.2.0-shaded.jar
[INFO] Dependency-reduced POM written at: /opt/spark-rm/output/spark-3.2.0-bin-hadoop2.7/launcher/dependency-reduced-pom.xml

Could you help fix it?

@sunchao
Copy link
Member Author

sunchao commented Sep 24, 2021

@gengliangwang Oops, I'm taking a look on this.

@sunchao
Copy link
Member Author

sunchao commented Sep 24, 2021

Yes this is the reason that last time I disabled the config. I think we are running into some Maven bug that is similar to this one. To fix it, I think we can put the newly introduced

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>${hadoop-client-runtime.artifact}</artifactId>
      <version>${hadoop.version}</version>
      <scope>test</scope>
    </dependency>

in launcher/pom.xml to a hadoop-3.2 profile section. Tested locally and this is working. Let me see if other modules have similar issues.

@sunchao
Copy link
Member Author

sunchao commented Sep 24, 2021

I opened #34100 for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants