[SPARK-21708][BUILD] Migrate build to sbt 1.x #759

Merged
merged 1 commit into master from lmartini/sbt-1.x on Apr 20, 2021

Conversation

LorenzoMartini

@LorenzoMartini LorenzoMartini commented Apr 19, 2021

Original PR description

Migrate the sbt-launcher URL to download one for sbt 1.x.
Update plugin versions where required by the sbt update.
Change the sbt version to the latest released at the moment, 1.3.13.
Adjust build settings according to the plugin and sbt changes.

Migration to sbt 1.x:

  1. enhances the dev experience
  2. updates build plugins to bring in their new features and bug fixes
  3. improves build performance on the sbt side
  4. eases the move to Scala 3 / Dotty

No.

All existing tests passed, both on Jenkins and via GitHub Actions, and also manually for the Scala 2.13 profile.

Closes apache#29286 from gemelen/feature/sbt-1.x.

Authored-by: Denis Pyshev <git@gemelen.net>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

Upstream SPARK-XXXXX ticket and PR link (if not applicable, explain)

[SPARK-21708][BUILD] Migrate build to sbt 1.x
Commit: apache@6daa2ae
PR: apache#29286

What changes were proposed in this pull request?

Bump sbt version to 1.x.

There are a few non-trivial changes related to versions.
First of all, the original PR was introduced on top of Spark 3.1, so we had to adapt a few things for it to work on top of Spark 3.0:

  • [Important] The sbt bump brings a MiMa bump, and that surfaces some binary breaks. Comparing against Spark 2.4 there are 200+ listed breaks; comparing against 3.0 instead, we only hit the breaks that are already added as excludes in the original PR (MimaExcludes.scala contains them), plus 3 additional ones. According to the discussion in this comment from upstream, it is correct for us to compare against version 3.0; the upstream commit only included this change because they forgot to make it after releasing 3.0. You can see they hit the same problem upstream with this bump, per this comment.
    The breaks are listed under v30excludes instead of upstream's v31excludes because our current version is 3.0.0. A sketch of how these excludes look in MimaExcludes.scala follows this list.
  • [Workaround] The bump broke sbt unidoc: it attempts to generate docs for generated classes and fails with NoClassDef errors. Since we don't care about unidoc output from the sbt build, we exclude the unidoc task for now.
  • [Small added detail] We have to avoid restoring build/sbt from the cache in Circle builds, because Circle had cached the existing file, which points at the old sbt location (Bintray).
  • [Important note] I reduced Circle parallelism from 12 to 8 for the Scala tests because too many parallel tests can interact with one another, which makes the task flake heavily. Lower parallelism reduces the chance of flaking, at the cost of raising the test time from around 33 minutes to about 42 minutes.
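For concreteness, here is a rough sketch of how these fork-specific breaks are listed under v30excludes in project/MimaExcludes.scala. The two filter entries are the ones quoted later in this review; the surrounding structure follows the usual MimaExcludes layout and is not a verbatim copy of the file.

// project/MimaExcludes.scala (sketch only, not the full file)
import com.typesafe.tools.mima.core._

lazy val v30excludes = Seq(
  // TODO(lmartini): Additional excludes not in upstream but unique to palantir fork
  ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.SparkContext.initializeForcefully"),
  ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.broadcast.Broadcast.initializeForcefully")
)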

Why are the changes needed?

The SBT bump is needed because:

  • sbt 0.x (current) is only available on Bintray, which is going to shut down in a matter of days
  • sbt 0.x introduces a test dependency on scala module 2.10, which is an issue when bumping the Avro version, as 2.12 is needed.

Does this PR introduce any user-facing change?

No

@@ -28,9 +28,6 @@ all-branches-and-tags: &all-branches-and-tags
# Step templates

step_templates:
  restore-build-binaries-cache: &restore-build-binaries-cache

This had to be removed because it picks up older build/sbt and doesn't get the updated versions

@@ -390,7 +390,8 @@ def build_spark_assembly_sbt(extra_profiles, checkstyle=False):
    if checkstyle:
        run_java_style_checks(build_profiles)

    build_spark_unidoc_sbt(extra_profiles)
    # TODO(lmartini): removed because broken, checks generated classes
    # build_spark_unidoc_sbt(extra_profiles)

We don't need unidoc, and this broke when attempting to generate docs for autogenerated files. I didn't want to invest too much into this, but can try if needed. An sbt-side alternative is sketched below.
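For reference, an sbt-side alternative would be to filter the autogenerated sources out of unidoc rather than skipping the call in the Python build script. This is only a sketch against sbt-unidoc's unidocAllSources key, not something this PR does, and the "generated" path filter is purely illustrative.

// SparkBuild.scala (sketch): keep unidoc enabled but drop autogenerated sources from it.
unidocAllSources in (ScalaUnidoc, unidoc) := {
  (unidocAllSources in (ScalaUnidoc, unidoc)).value
    .map(_.filterNot(_.getCanonicalPath.contains("generated")))
}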

val organization = "org.apache.spark"
val previousSparkVersion = "2.4.0"
val previousSparkVersion = "3.0.0"

This file is picked cleanly from upstream. It seems like they forgot to do this bump earlier and added it in this PR out of convenience (context: apache#29286 (comment) and apache#22977 (comment)).


If we keep the previous version at 2.4.0, the number of binary breaks is 200+, so this is the best option anyway.

// TODO(lmartini): Additional excludes not in upstream but unique to palantir fork
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.SparkContext.initializeForcefully"),
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.SparkContext.initializeForcefully"),
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.broadcast.Broadcast.initializeForcefully"),

These last 3 excludes had to be added in our fork only. It should be ok

dependencyOverrides ++= MavenHelper.fromPom { pom =>
  for {
    dep <- pom.getDependencyManagement.getDependencies.asScala
  } yield MavenHelper.convertDep(dep)
}.value.toSet
}.value.toSeq

sbt got stricter: dependencyOverrides is now a Seq[ModuleID] rather than a Set[ModuleID], so toSet was no longer accepted. A minimal illustration follows.
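A minimal illustration of the type change behind this diff; the override shown is hypothetical and only demonstrates the shape sbt 1.x expects.

// sbt 0.13: dependencyOverrides was a Set[ModuleID], so appending a Set compiled.
// sbt 1.x:  the key is a Seq[ModuleID], so the generated overrides must be a Seq.
dependencyOverrides ++= Seq(
  "org.apache.avro" % "avro" % "1.10.1" // hypothetical override, for illustration only
)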

}.value
test := (test andFinally Def.taskDyn {
  copyTestReportsToCircle
}).value

Again, sbt got stricter and the fancy tuple plugin syntax was removed. This is the equivalent of the code above.


// need to make changes to uptake sbt 1.0 support in "com.cavorite" % "sbt-avro-1-7" % "1.1.2"
addSbtPlugin("com.cavorite" % "sbt-avro" % "0.3.2")
addSbtPlugin("com.cavorite" % "sbt-avro" % "2.1.1")
libraryDependencies += "org.apache.avro" % "avro-compiler" % "1.10.1"

The upstream PR changed the avro-compiler version to 1.8.2, but we're already ahead of it.

Comment on lines -126 to +128
lazy val sparkGenjavadocSettings: Seq[sbt.Def.Setting[_]] = Seq(
  libraryDependencies += compilerPlugin(
    "com.typesafe.genjavadoc" %% "genjavadoc-plugin" % unidocGenjavadocVersion.value cross CrossVersion.full),
lazy val sparkGenjavadocSettings: Seq[sbt.Def.Setting[_]] = GenJavadocPlugin.projectSettings ++ Seq(

Looks like this is why the unidoc check started failing.

scalacOptions ++= Seq(
"-P:genjavadoc:out=" + (target.value / "java"),

Or this one, actually.

Comment on lines -298 to +303
analysis.infos.allInfos.foreach { case (k, i) =>
  i.reportedProblems foreach { p =>
    val deprecation = p.message.contains("is deprecated")
analysis.asInstanceOf[sbt.internal.inc.Analysis].infos.allInfos.foreach { case (k, i) =>
  i.getReportedProblems foreach { p =>
    val deprecation = p.message.contains("deprecated")

(Our fork previously checked for "is deprecated", but checking for just "deprecated" should give a superset.)

@LorenzoMartini LorenzoMartini merged commit f779ffb into master Apr 20, 2021
@LorenzoMartini LorenzoMartini deleted the lmartini/sbt-1.x branch April 20, 2021 10:17