
[ADAM-752] Build for many combos of Spark/Hadoop versions. #765

Merged
massie merged 3 commits into bigdatagenomics:master from fnothaft:update-jenkins-script on Aug 12, 2015

Conversation

fnothaft
Member

Resolves build comments on #752. I have set Jenkins up as a 3D!!!!!! matrix:

[screenshot: Jenkins build matrix, 2015-08-10]

@fnothaft
Member Author

OK, so a few interesting things.

  • We can't build for Spark 1.1.1, which I had forgotten (repartitionAndSortWithinPartitions doesn't exist there)
  • It seems like the Spark-based unit tests will only run in parallel on Spark 1.2.0; I'm not sure what is broken in Spark 1.3+. I've told Jenkins to just run the builds sequentially for now.

@@ -26,16 +25,24 @@ export SPARK_DRIVER_MEMORY=8g

pushd "$ADAM_TMP_DIR"


if [[ $HADOOP_VERSION == "1.0.4" ]]; then
Member

Maybe echo $HADOOP_VERSION | egrep '^1.0.' would be more general, in case we use e.g. 1.0.4 as the hadoop test version?

Member

or possibly [[ $HADOOP_VERSION =~ ^1\.0 ]], likewise below, if that seems easier

Member

What @ryan-williams said. :) Much cleaner.
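For reference, the two suggested checks sketched side by side. The function names and version strings here are illustrative, not from the PR; note the dots are escaped so they match literally rather than as regex wildcards:

```shell
#!/usr/bin/env bash
# Sketch of the two version-check styles suggested above.

# egrep style: match any 1.0.x Hadoop version
is_hadoop1_egrep() {
    echo "$1" | egrep -q '^1\.0\.'
}

# bash [[ =~ ]] style: no subshell or pipe needed
is_hadoop1_regex() {
    [[ $1 =~ ^1\.0 ]]
}

if is_hadoop1_regex "1.0.4"; then
    echo "hadoop1"
else
    echo "hadoop2"
fi
```

The bash regex form avoids spawning `echo` and `egrep` for every check, which is presumably why it reads as cleaner here.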

@ryan-williams
Member

That looks pretty neat. I'm guessing there are some config changes you're making in Jenkins that aren't depicted here, e.g. to populate $SPARK_VERSION?

tar xzvf spark-1.1.0-bin-hadoop1.tgz
export SPARK_HOME="${ADAM_TMP_DIR}/spark-1.1.0-bin-hadoop1"
HADOOP=hadoop1
elif [[ $HADOOP_VERSION == "2.6.0" ]]; then
Member

Maybe echo $HADOOP_VERSION | egrep '^2.6.' would be more general, in case we use e.g. 2.6.3 as the hadoop test version?

@massie
Member

massie commented Aug 10, 2015

This is good stuff, Frank.

@fnothaft
Member Author

Jenkins, retest this please.

@fnothaft
Member Author

Fun times! Guess what doesn't work?

If your guess was that running:

mvn package -Dspark.version=1.4.1

would lead to Destruction, Terror, and Mayhem, you win $5! At least, it does for me locally.

I will sort this out later, but if anyone has seen this before or has any clues, I would love your thoughts.

@heuermh
Member

heuermh commented Aug 11, 2015

Destruction, Terror, and Mayhem and a 45 minute wait to see if your pull request turns green. :)

Maybe a middle ground, or set up a quick build in Travis and leave Jenkins 3D Awesome for a nightly integration test?

Would a profile work for where the -Dspark-version is failing?

@fnothaft
Member Author

OK, so the problem seems to have been that we had a completely unused (???) dependency on com.amazonaws:aws-java-sdk, which was pulling in a version of com.fasterxml.jackson.core:jackson-core that is incompatible with Spark 1.3.0 and later. This produced a red-herring test failure message implying that the unit tests were crashing because we were running multiple SparkContexts in parallel. That unused dependency is gone now, so the build should pass (and we no longer need to run all the builds sequentially, which is good for obvious reasons).
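For anyone chasing a similar conflict: `mvn dependency:tree -Dincludes=com.fasterxml.jackson.core` will show which artifact drags in the offending jackson-core. The fix described above amounts to deleting the unused dependency from the pom; roughly (a sketch — the version element and any other coordinates aren't shown in this thread):

```xml
<!-- Removed: unused dependency whose transitive jackson-core
     conflicted with Spark 1.3.0+ -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk</artifactId>
</dependency>
```

Had the dependency actually been needed, an `<exclusions>` block on it would have been the alternative to outright removal.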

@fnothaft
Member Author

And the Spark 1.4.1/Hadoop 2.6.0 touchstone builds have passed! Huzzah! Now, for the rest of the builds.

@fnothaft
Member Author

Rebased and added a commit to clean up log junk when running 1.4.1 unit tests.

@fnothaft fnothaft force-pushed the update-jenkins-script branch 2 times, most recently from 23b9e41 to bbdc3ce Compare August 11, 2015 17:52
@fnothaft
Member Author

Cleaned up RE: the comments above around version checking.

@fnothaft
Member Author

Jenkins, retest this please.

@@ -345,12 +345,36 @@
<type>test-jar</type>
<scope>test</scope>
</dependency>
<!--
Member

any reason for this block?

Member Author

I was testing something; this can be removed.

@fnothaft
Member Author

Jenkins, retest this please.

@fnothaft
Member Author

IT WORKS! IT WORKS! IT REALLY DOES!!!!!!!
;)

@heuermh
Member

heuermh commented Aug 12, 2015

Are all the red "Failed - skipped" entries in build 842 supposed to be "Not run"? Looks like they might be combinations that aren't supported, such as Hadoop 1 with Spark 1.4.1. Or maybe I just need my 3D glasses?

@fnothaft
Member Author

@heuermh correct; Spark 1.4.1/Hadoop 1.x is skipped, as it won't build with the way Spark is currently packaged as Maven artifacts (there was a long discussion of this on a previous ADAM PR).

@fnothaft
Member Author

They show up as greyed out red because the last build of that combo failed.

@heuermh
Member

heuermh commented Aug 12, 2015

From what I can see, it looks like a 15-minute build time, up from 10 minutes; not too bad. +1

@massie massie merged commit 6cade81 into bigdatagenomics:master Aug 12, 2015
@massie
Member

massie commented Aug 12, 2015

Nice addition, Frank!
