-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-45292][SQL][HIVE] Remove Guava from shared classes from IsolatedClientLoader #42599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, I think the Guava version needs to be a function of version, rather than matching Spark's Guava version or omitting Guava: if we remove Guava from shared classes but also changed the now-non-shared metastore Guava version then it's the same net effect and breaks older Hive versions.
I believe that newer versions of Hive shade Guava (in which case they're insensitive to whatever value we set here). I think we could either (a) continue to unconditionally use Guava 14.0.1 here, or (b) conditionally use it only for older Hive versions that predated the Guava shading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we remove Guava from shared classes but also changed the now-non-shared metastore Guava version then it's the same net effect and breaks older Hive versions.
Hmm, I don't get the idea. After removing Guava from the shared class, we don't need to keep Hive and Spark using the same Guava. And we did not add Guava to HiveVersion#exclusions, each Hive version should automatically pull all transitive dependencies including Guava if necessary.
Anyway, let's simply try (a) first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoshRosen Unfortunately, it does not work.
HiveClientSuite.0.13: createDatabase
Caused by: sbt.ForkMain$ForkError: java.lang.AssertionError: java.lang.NoSuchMethodException: com.google.common.base.internal.Finalizer.startFinalizer(java.lang.Class, java.lang.Object)
at com.google.common.base.FinalizableReferenceQueue.getStartFinalizer(FinalizableReferenceQueue.java:283)
at com.google.common.base.FinalizableReferenceQueue.<clinit>(FinalizableReferenceQueue.java:86)
at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:427)
at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:501)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:298)
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, cc @sunchao, I think the experiment indicates that we can not upgrade Guava while keeping IsolatedClientLoader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoshRosen @sunchao do you have any suggestions for this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
things changed after we removed support for Hive prior 2.0.0, we can remove Guava from the shared class list now, thus we can do Guava upgrading while keeping IsolatedClientLoader
8dae326 to
0afd3cc
Compare
|
cc @wangyum @LuciferYang too |
This reverts commit 75e3c367050d9c4e3a0a96efbeecaae1beaa44ca.
0afd3cc to
537b1df
Compare
|
@JoshRosen @sunchao @wangyum I think no known technical issues are blocking us from upgrading Guava while keeping |
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I agree with @pan3793 . It looks fine to me.
Let me merge this as a part of Apache Spark 4.0.0.
Since the change is tiny, we can revert easily if there is any regression report during QA period.
|
Merged to master. |
|
@dongjoon-hyun thank you, will continue Guava upgrading works once Hive 2.3.10 is available |
Looking forward to the release of Hive 2.3.10 :) |
|
Thank you! cc @sunchao for Apache Hive 2.3.10 release~ |
|
Is there any update for Apache Hive 2.3.10, @pan3793 ? |
@dongjoon-hyun we have made several patches into branch-2.3 after 2.3.9 https://github.com/apache/hive/commits/branch-2.3/ seems there are some expected patches were not get in
kindly ping @sunchao, do you have ETA for 2.3.10? and please let me know what can I do for this release. |
|
Hey @pan3793 @dongjoon-hyun , sorry for the delay on Hive 2.3.10. I plan to start working on the release in the next few days. |
|
Thank you always, @sunchao ! |
What changes were proposed in this pull request?
Try removing Guava from
sharedClassesas suggested by @JoshRosen in #33989 (comment) and #42493 (comment)Why are the changes needed?
Unblock Guava upgrading.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
CI passed (embedded HMS) and verified in the internal YARN cluster (remote HMS with kerberos-enabled).
Was this patch authored or co-authored using generative AI tooling?
No.