Bump Spark version to 1.4 #752

Closed
laserson opened this issue Aug 4, 2015 · 23 comments

@laserson (Contributor) commented Aug 4, 2015

This issue will track any progress necessary for bumping ADAM's Spark version to 1.4.

See #750.
See #659.

laserson changed the title from "Bump Spark version so 1.4" to "Bump Spark version to 1.4" on Aug 4, 2015
@laserson (Contributor, Author) commented Aug 4, 2015

It seems spark.kryoserializer.buffer.mb has been deprecated.
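For reference, the Spark 1.4 replacement key is spark.kryoserializer.buffer, which takes a size string with a unit suffix rather than a bare megabyte count. A rough sketch of the rename (the 4 MB value is made up for illustration):

```bash
# Old key, deprecated as of Spark 1.4 (value was a bare megabyte count):
#   spark.kryoserializer.buffer.mb   4
# Spark 1.4 replacement key (value takes a unit suffix):
echo "spark.kryoserializer.buffer 4m" >> "$SPARK_HOME/conf/spark-defaults.conf"
```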

@laserson (Contributor, Author) commented Aug 4, 2015

See #751 for a change that needs to be included.

@laserson (Contributor, Author) commented Aug 4, 2015

Correct

laserson reopened this Aug 5, 2015
@ryan-williams (Member) commented

So, I realized something awkward about this that maybe others have already processed but I hadn't: there's the spark.version property in ADAM's POM, and then there's whatever Spark version the user's $SPARK_HOME points at, and the two are essentially independent.

The former only affects the Spark classes we link against, which AFAIK have not changed in a way we care about since 1.2, so bumping it in isolation shouldn't make any difference to anyone.

The latter determines which scripts (e.g. Spark's bin/utils.sh) are available; it definitely seems like high time to allow $SPARK_HOME to point at Spark >= 1.4, but there's a question of how/whether to also support $SPARK_HOME pointing at Spark versions < 1.4, which is currently being discussed in #754, I guess?
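To make the distinction concrete, a quick sanity check might look like the following (this assumes the POM exposes the Spark dependency via a spark.version property, as described above):

```bash
# Two Spark versions that can diverge: the compile-time dependency in ADAM's
# POM and the runtime Spark under the user's $SPARK_HOME.
grep '<spark.version>' pom.xml              # version ADAM links against
"$SPARK_HOME"/bin/spark-submit --version    # version the wrapper scripts will actually use
```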

@heuermh (Member) commented Aug 5, 2015

Yep, the compile-time dependency shouldn't matter, unless it does. :) E.g. if a binary incompatibility in a 1.x version of Spark or one of its transitive dependencies slips through.

Is #754 backward compatible script-wise with previous versions of Spark? A few simple examples I tried worked for me on Spark 1.3.1. It would be best if we didn't require $SPARK_HOME to be set at all; I think it was only required to find the utils.sh script.

@ryan-williams (Member) commented

Good point about transitive dependencies.

If the net result of #754 ends up being that we don't need $SPARK_HOME set and the ADAM scripts work against arbitrary Spark versions >= 1.2, then that seems like a great step forward. I'll try to keep following along there.

@fnothaft (Member) commented Aug 5, 2015

The transitive-dependency thing is both nasty and common, unfortunately...

@laserson (Contributor, Author) commented Aug 5, 2015

SPARK_HOME is also needed to find the spark-submit, spark-shell, and pyspark scripts.

@heuermh (Member) commented Aug 5, 2015

> SPARK_HOME is also needed to find the spark-submit, spark-shell, and pyspark scripts.

Would it be a fair assumption that those should be on the user's path?

@laserson (Contributor, Author) commented Aug 5, 2015

Probably depends on the user. It never is for me, because I often use different versions of Spark. One option would be to check if SPARK_HOME is set, and if not, simply try whatever is on the path.
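A minimal sketch of that fallback (variable names are illustrative, not the actual adam-submit wiring):

```bash
# Prefer an explicit SPARK_HOME; otherwise fall back to whatever spark-submit
# is already on the PATH.
if [ -n "$SPARK_HOME" ]; then
  SPARK_SUBMIT="$SPARK_HOME/bin/spark-submit"
else
  SPARK_SUBMIT="$(command -v spark-submit)" || {
    echo "Either set SPARK_HOME or put spark-submit on your PATH." >&2
    exit 1
  }
fi
"$SPARK_SUBMIT" --version
```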

@heuermh (Member) commented Aug 5, 2015

> One option would be to check if SPARK_HOME is set, and if not, simply try whatever is on the path.

+1

@ryan-williams (Member) commented

Ah yeah, I don't have spark-{submit,shell} on my $PATH either, since I frequently switch Spark versions.

Checking both sgtm too.

Is this done now that #754 is in?

Should we bump the Spark version in the POM? I can file a separate issue for that if necessary; I was just noting while doing README refactoring in #763 / #764 that we say we continuously build against Spark 1.1.0, our POM says Spark 1.2.0, and we actually support up to Spark 1.4.1 (and likely soon 1.5.0). Is there any cleanup we should do to bring those into line?

@fnothaft (Member) commented

Well, the POM is 1.4.1 now, right? How about we move CI to 1.4.1 as well? That seems like the simplest solution. I'll prep a PR.

@ryan-williams (Member) commented

> Well, the POM is 1.4.1 now, right?

Nope! 1.2.0. Unless I am really out of it.

@fnothaft (Member) commented

You are correct, nevermind!

I am OK with having the Jenkins sanity-test script scripts/jenkins-test check multiple versions of Spark (e.g., 1.2.1, 1.3.1, 1.4.1). That might be a good idea anyway.
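Roughly something like this, as a sketch only (the spark.version property and the exact version list are assumptions about how jenkins-test would be parameterized):

```bash
# Build and unit-test ADAM against several Spark compile-time versions.
for spark_version in 1.2.1 1.3.1 1.4.1; do
  mvn clean package -Dspark.version="$spark_version" || exit 1
done
```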

@ryan-williams (Member) commented

Having Jenkins test a matrix of Spark versions sgtm, @fnothaft.

I was going to just bump the Spark version in the POM via github's web-edit-file flow, but then I remembered your warnings about transitive deps above. Do you have some system for evaluating the danger of such an upgrade?

@fnothaft (Member) commented

Do we want to matrix test Spark at both the build and the executable level? I am OK with either.

> I was going to just bump the Spark version in the POM via github's web-edit-file flow, but then I remembered your warnings about transitive deps above. Do you have some system for evaluating the danger of such an upgrade?

The jenkins-test script, which, alas, currently only tests a single version of Spark... How about I write an enhancement to our Jenkins flow, we merge that, and then we bump the POM?

@ryan-williams (Member) commented

> Do we want to matrix test Spark at both the build and the executable level? I am OK with either.

"Build level": basically run mvn package?
"Executable level": run some ADAM commands, or?

Upgrading jenkins-test then bumping POM and updating docs sgtm.

Other random question: now that SPARK-8057 is in, will we be able to support Hadoop 1 again in Spark 1.5.0? Do we care / want to? I was just noticing Hadoop-1-specific logic in jenkins-test.

@fnothaft (Member) commented

"Build level": basically run mvn package?
"Executable level": run some ADAM commands, or?

Exactly!
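In other words, a rough sketch of the two levels (the transform invocation and file names are placeholders, not the actual jenkins-test contents):

```bash
# "Build level": compile and run the unit tests.
mvn clean package

# "Executable level": exercise a real ADAM command against the Spark that
# $SPARK_HOME (or the PATH) provides; input/output paths are placeholders.
./bin/adam-submit transform sample.sam sample.adam
```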

> Upgrading jenkins-test then bumping POM and updating docs sgtm.

+1

> Other random question: now that SPARK-8057 is in, will we be able to support Hadoop 1 again in Spark 1.5.0? Do we care / want to? I was just noticing Hadoop-1-specific logic in jenkins-test.

Since spark-ec2 is Hadoop 1 centric, I'd like to keep testing hooks in place that ensure people can run ADAM on top of those scripts. I think we would have to waive the Hadoop 1 / Spark 1.4.1 combo, but otherwise we should be OK. That exclusion is straightforward in Jenkins.

@heuermh (Member) commented Aug 10, 2015

"Build level": basically run mvn package?
"Executable level": run some ADAM commands, or?

Exactly!

Right, some of the potential binary-incompatibility issues with transitive dependencies won't show up at build time, and there is a possibility that the classpath in test scope could differ from the runtime classpath.
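One way to at least eyeball that risk before bumping (a suggestion, not something settled in this thread; it again assumes a spark.version property in the POM):

```bash
# Compare the resolved dependency trees under the old and new Spark versions
# to spot transitive-dependency changes that compilation alone won't catch.
mvn dependency:tree -Dspark.version=1.2.0 -DoutputFile=deps-1.2.0.txt
mvn dependency:tree -Dspark.version=1.4.1 -DoutputFile=deps-1.4.1.txt
diff deps-1.2.0.txt deps-1.4.1.txt
```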

@heuermh (Member) commented Aug 19, 2015

If I have everything right:

#750 has been closed.
I believe #659 can be closed.
Pull request #753 was merged and then rolled back.
Additional Jenkins builds (and another transitive dep fix) were added in #765.
New pull request #778 re-applies the change bumping the Spark compile-time dependency to version 1.4.1.

@ryan-williams (Member) commented

That looks right; I just closed #659.

@fnothaft (Member) commented

Closed by 7e8eb05.
