use hadoop 2.8.0-palantir2 #107
|
In addition to the tests that ran as part of the build (https://circleci.com/gh/palantir/hadoop/156), I also ran the s3a tests on my laptop (pointed to an actual s3 bucket). The only failures were that you can overwrite a directory, which is expected based on having reverted HADOOP-13188. (The revert is hopefully temporary, but causes us to have the same behavior as we already have in 2.7.3, so not a regression.) |
|
Grr. Need to pull in Actual fix is probably on the Hadoop side, not the Spark side -- filed https://issues.apache.org/jira/browse/HDFS-11431. |
… the classes it relies on
|
Tried briefly to fix HDFS-11431, but it's a bit icky -- the |
|
Apparently I've now broken a bunch of hive tests. Can't tell whether this is the cause, but looks relevant: |
|
Hive test failures are evidently caused by inability to resolve the Hadoop version. |
|
Why isn't the penultimate one a hit?
…On Sat, 25 Feb 2017 at 19:05, sjrand ***@***.***> wrote:
Hive test failures are evidently caused by inability to resolve the Hadoop
version.
2017-02-19 22:05:56.12 - stderr> [FAILED ] org.apache.hadoop#hadoop-client;2.8.0-palantir1!hadoop-client.jar: (0ms)
2017-02-19 22:05:56.12 - stderr>
2017-02-19 22:05:56.12 - stderr> ==== local-m2-cache: tried
2017-02-19 22:05:56.12 - stderr>
2017-02-19 22:05:56.12 - stderr> file:/home/ubuntu/spark/dummy/.m2/repository/org/apache/hadoop/hadoop-client/2.8.0-palantir1/hadoop-client-2.8.0-palantir1.jar
2017-02-19 22:05:56.12 - stderr>
2017-02-19 22:05:56.12 - stderr> ==== local-ivy-cache: tried
2017-02-19 22:05:56.12 - stderr>
2017-02-19 22:05:56.12 - stderr> /home/ubuntu/.ivy2/local/org.apache.hadoop/hadoop-client/2.8.0-palantir1/jars/hadoop-client.jar
2017-02-19 22:05:56.12 - stderr>
2017-02-19 22:05:56.12 - stderr> ==== central: tried
2017-02-19 22:05:56.12 - stderr>
2017-02-19 22:05:56.121 - stderr> https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client/2.8.0-palantir1/hadoop-client-2.8.0-palantir1.jar
2017-02-19 22:05:56.121 - stderr>
2017-02-19 22:05:56.121 - stderr> ==== spark-packages: tried
2017-02-19 22:05:56.121 - stderr>
2017-02-19 22:05:56.121 - stderr> http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-client/2.8.0-palantir1/hadoop-client-2.8.0-palantir1.jar
2017-02-19 22:05:56.121 - stderr>
2017-02-19 22:05:56.121 - stderr> ==== repo-1: tried
2017-02-19 22:05:56.121 - stderr>
2017-02-19 22:05:56.121 - stderr> http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-client/2.8.0-palantir1/hadoop-client-2.8.0-palantir1.jar
|
|
You need to add our bintray. Won't accept with failing tests
|
|
Yep, fair enough. Trying to find where to add it -- I'm a little confused because we already include |
e9e3871 to
5d590df
|
Woo, finally got a green build. I'm running another build now with the hadoop version bumped from 2.8.0-palantir1 to 2.8.0-palantir2, which picks up some upstream s3a fixes. If that build also passes, are you guys down to merge this? |
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
</dependency>
<dependency>
Why is this necessary now?
If I don't have this dependency, then the resulting dist doesn't have hadoop-hdfs-2.8.0-palantir2.jar in its jars/ dir.
Ok. Something has changed in the packaging, since with 2.7.3 the jar is there. Would be good to understand what happened upstream, but not blocking.
I think it's https://issues.apache.org/jira/browse/HDFS-6200. Previously hadoop-client would bring in hadoop-hdfs, but after that change, hadoop-client brings in hadoop-hdfs-client instead. But then because of https://issues.apache.org/jira/browse/HDFS-11431, I have to manually add hadoop-hdfs back in.
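A rough sketch of the pom shape this describes: hadoop-client stays as it is, and hadoop-hdfs is declared explicitly next to it so the jar ends up in the dist. This is an illustration rather than the exact diff in this PR. It assumes the versions are managed elsewhere (for example in a parent dependencyManagement section), which is why no version elements appear.

```xml
<!-- Illustrative fragment only; assumes versions come from dependencyManagement. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
</dependency>
<!-- After HDFS-6200, hadoop-client brings in hadoop-hdfs-client rather than hadoop-hdfs. -->
<!-- Because of HDFS-11431, hadoop-hdfs is added back by hand until that issue is fixed. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
</dependency>
```

One way to check the effect is to build the dist and look for a hadoop-hdfs jar in its jars/ dir, as described above.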
hiveArtifacts.mkString(","),
SparkSubmitUtils.buildIvySettings(
Some("http://www.datanucleus.org/downloads/maven2"),
Some("http://dl.bintray.com/palantir/releases"),
Can this be a list? (Not sure what the exact signature is.) Fine if not.
Apparently yes: "@param remoteRepos Comma-delimited string of remote repositories other than maven central". Will fix.
</exclusion>
</exclusions>
</dependency>
<!-- TODO (srand) Remove this when https://issues.apache.org/jira/browse/HDFS-11431 is fixed -->
Can you create an issue as well and reference it here?
|
couple of pom nits, otherwise 👍 |
|
@sjrand just to confirm, we expect a client running hadoop-2.8.0-palantir2 to work with a server running hadoop-2.7.3? And what about the reverse: a client on 2.7.3 against a hadoop-palantir server? |
|
Merging this PR caused this failure in the Circle test: I think it's a flake -- https://circleci.com/gh/palantir/spark/424 -- so kicked off another build. |
|
Yes, a Spark client application running 2.8.0-palantir2 against a 2.7* (or vendored equivalent) cluster has worked fine in my experience. MapReduce clients have not fared so well (classpath stuff), but that's not relevant here. There are no plans to run palantir-hadoop on the cluster -- no reason to try to play Hadoop vendor when several other companies already do a way better job of it than I could. |
* Change the API contract for uploading local jars. This mirrors what YARN and Mesos expect. * Address comments * Fix test
@robert3005 @pwoody @ash211 for real this time
List of differences between upstream branch-2.8.0 and 2.8.0-palantir1: https://github.com/palantir/hadoop/blob/branch-2.8.0/PALANTIR-CHANGELOG.md