[SPARK-33090][BUILD][test-hadoop2.7] Upgrade Google Guava to 29.0-jre #30022
Conversation
* Compatible with Hadoop > 3.2.0
* Future proof for a while
FYI @dongjoon-hyun @srowen |
Hi @AngersZhuuuu , I'm not sure I see any benefit in that. It will increase the complexity of an already complicated build system. It's significantly more than just a version number change. If you look at the changed files you will see what I mean. Complexity is the enemy of maintainability. |
The big problem here is that previous Hadoop versions (<= 3.2.0) use Guava 14 or so, so this would break some compatibility with them. I think it could only happen behind a Hadoop 3.2.1+ profile, but it may be a good idea. |
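The profile idea above could, in principle, look like the following Maven sketch. This is a hypothetical illustration only, not the layout of Spark's actual pom files; the property and profile names are assumptions.

```xml
<!-- Hypothetical sketch: keep Guava 14 as the default and gate the newer
     Guava version behind a Hadoop 3.2+ profile, so older Hadoop profiles
     are unaffected. Names are illustrative, not taken from this PR. -->
<properties>
  <guava.version>14.0.1</guava.version>
</properties>

<profiles>
  <profile>
    <id>hadoop-3.2</id>
    <properties>
      <!-- Newer Hadoop releases call Guava APIs absent from Guava 14. -->
      <guava.version>29.0-jre</guava.version>
    </properties>
  </profile>
</profiles>
```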
ok to test |
Thanks, @sfcoy . This is an interesting approach. |
BTW, @sfcoy and @AngersZhuuuu, for the following, the Apache Spark community wants to use the official Hadoop 3 client to cut down the dependency tree dramatically.
Please see here.
After SPARK-29250, I guess this PR will become a general Guava version upgrade PR without any relation to it. cc @sunchao |
IMO, if Spark 3 with hadoop-3.2 can work well in Hadoop clusters (2.6/2.7/2.8, etc.), it's OK to just use the Hadoop 3.2 client. |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #129752 has finished for PR 30022 at commit
Not sure if this works well with Hive 2.3.x, since it is still on Guava 14.0.1.
Yes it's expected to work. There is an issue HDFS-15191 which potentially breaks compatibility between Hadoop 3.2.1 and 2.x server but it is fixed in 3.2.2 (which Spark is probably going to use). |
Kubernetes integration test starting |
Kubernetes integration test status failure |
The Kubernetes integration test appears to be running out of disk space.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merits of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
Upgrade the Google Guava dependency for compatibility with Hadoop 3.2.1 and Hadoop 3.3.0.
Why are the changes needed?
Spark fails at runtime with NoSuchMethodException when built and run against these Hadoop versions, which call com.google.common.base.Preconditions methods that are not present in the Guava version currently specified by Spark.
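The failure mode described above is a binary-compatibility gap: newer Hadoop releases are compiled against Preconditions overloads that do not exist in Guava 14, so the missing method only surfaces at runtime. A minimal, self-contained sketch of probing for such an overload via reflection (GuavaCompatCheck and hasOverload are hypothetical names, not part of this PR):

```java
import java.lang.reflect.Method;

public class GuavaCompatCheck {
    // Hypothetical probe: check whether a class exposes the
    // checkArgument(boolean, String, Object) overload that newer Hadoop
    // code calls. With Guava 14 on the classpath the lookup fails, which
    // is what surfaces at runtime as a missing-method error.
    static boolean hasOverload(Class<?> cls) {
        try {
            Method m = cls.getMethod(
                "checkArgument", boolean.class, String.class, Object.class);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Without Guava available here, probe a stand-in class; against a
        // real classpath you would probe
        // com.google.common.base.Preconditions instead.
        System.out.println(hasOverload(Object.class)); // prints false
    }
}
```

Upgrading Guava so that the overloads Hadoop compiled against are present on the runtime classpath is what resolves this class of failure.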
Does this PR introduce any user-facing change?
This change introduces new dependencies into the build that are pulled in transitively via the Guava pom file.
How was this patch tested?
We are currently running ETL production processes using Spark builds with this Guava version (based on the 3.0.1 tag).