Revert "[SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache" #33784

LuciferYang · 2021-08-19T05:21:45Z

What changes were proposed in this pull request?

This pr revert the change of SPARK-34309, includes:

Why are the changes needed?

No really performance improvement in Spark
Added an additional dependency

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass the Jenkins or GitHub Action

This reverts commit 281b00a.

…Guava Cache" This reverts commit 01cf6f4.

LuciferYang · 2021-08-19T05:28:15Z

cc @gatorsmile @HyukjinKwon @zsxwing @sarutak @sunchao @holdenk @mridulm @dongjoon-hyun and RemoteBlockPushResolver has some conflicts, I solved it manually. please help review ~

HyukjinKwon · 2021-08-19T05:39:17Z

hm I was thinking that we'd keep that fix since that's already merged.

sarutak · 2021-08-19T06:04:37Z

That PR was merged for 3.3.0 and we have time to release it. So, can we keep that change until we meet actual problem?

SparkQA · 2021-08-19T06:22:13Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47138/

SparkQA · 2021-08-19T07:17:16Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47138/

SparkQA · 2021-08-19T08:27:23Z

Test build #142637 has finished for PR 33784 at commit 756c905.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2021-08-19T16:05:01Z

I'm also negative about reverting this on master branch.

cc @sunchao , @viirya , @dbtsai

sunchao · 2021-08-19T16:38:36Z

+1 on keeping this. I think the main pushback is that this will leak dependency to downstream apps and cause conflicts, but we can figure out a way, such as shading, during the Spark 3.3 timeline. The same issue also applies to Guava today and we should tackle them together.

zsxwing · 2021-08-19T16:44:00Z

@LuciferYang Thank you for submitting the revert PR.

@sarutak @dongjoon-hyun As I mentioned in the previous PR, adding a popular library to Spark's classpath directly is risky. @LuciferYang 's suggestion to shade Caffeine is a good solution. But it sounds not worth because:

There is no evidence to show performance improvement in practice. The code path using cache is not on any hot path, and the difference between Guava Cache and Caffeine is probably not noticeable in practice.
Extra work to shade the following three libraries (Need to verify this approach is working):

caffeine/2.9.2//caffeine-2.9.2.jar
checker-qual/3.10.0//checker-qual-3.10.0.jar
error_prone_annotations/2.5.1//error_prone_annotations-2.5.1.jar

Increase the maintenance burden when backporting a patch that touches the changed code to other branches.

Do I miss any benefits provided by Caffeine?

holdenk · 2021-08-19T16:47:26Z

Maybe I'm missing some context, what's the reason for a revert to a slower implementation? Is it that there might be a conflict? Have you encountered this conflict with any projects? If anything reducing our scope on Guava (the cause of many conflicts) seems to be useful?

holdenk · 2021-08-19T16:50:39Z

Also wait, I'm under the impression that shuffle block fetching is in the hot-path, we have to block on fetches.
I'm (currently) against this change, but I'm super open to additional context (like if this is causing conflicts in practice).

zsxwing · 2021-08-19T17:05:24Z

I'm under the impression that shuffle block fetching is in the hot-path, we have to block on fetches.

The improvement doesn't sound noticeable since this is a RPC operation.

like if this is causing conflicts in practice

It's hard to check this before the release goes out. But since Caffeine is pretty popular (10.3k stars), the overlap of Spark and Caffeine users should not be small, and the chance to hit the conflict is not low.

SPARK-34309 also mentioned that

caffeine has been used in some open source projects like Cassandra, Hbase, Neo4j, Druid, Spring and so on.

Not sure whether they use it in the client or the server (It would be great if someone can help confirm). If they use it in the client, it may be a problem when people use the connector for these libraries with Spark.

holdenk · 2021-08-19T17:37:26Z

Ok looking at the datastax cassandra connector I'm not seeing caffeine in the direct dependencies - https://mvnrepository.com/artifact/com.datastax.oss/java-driver-core/4.13.0 , same thing with neo4j client https://mvnrepository.com/artifact/org.neo4j.driver/neo4j-java-driver/4.3.4

While I won't -1 this, I am -0 given the lack of an known conflict.

zsxwing · 2021-08-19T17:49:21Z

https://mvnrepository.com/artifact/com.github.ben-manes.caffeine/caffeine/usages says

Artifacts using Caffeine Cache (1,429)

https://mvnrepository.com/artifact/org.apache.spark/spark-sql/usages says

Artifacts using Spark Project SQL (1,436)

This suggests that Caffeine is as popular as Spark. I think the risk of conflict is pretty high.

holdenk · 2021-08-19T17:55:15Z

This suggests that Caffeine is as popular as Spark. I think the risk of conflict is pretty high.

Then it should be easy to find a case that matters :)

holdenk · 2021-08-19T17:58:42Z

Because, for comparison, we use antlr4 which has ~955 usages & bouncycastle with > 2.5k usages without problem.

zsxwing · 2021-08-19T18:11:32Z

Apache Solr is using Caffeine https://mvnrepository.com/artifact/org.apache.solr/solr-core/8.9.0 Sounds like a good example of potential conflict.

Anyway, we still cannot predict the Caffeine versions used by Spark applications directly since most of users won't publish their applications to maven central.

I'm definitely -1 for putting Caffeine on the Spark classpath directly. But +0 if other people think the extra work to shade the three libraries and the maintenance burden are the reasonable cost to pay.

holdenk · 2021-08-19T18:25:00Z

Apache Solr is using Caffeine https://mvnrepository.com/artifact/org.apache.solr/solr-core/8.9.0 Sounds like a good example of potential conflict.

I think it's good to distinguish between the client, which doesn't depend on solr-core or caffeine https://mvnrepository.com/artifact/org.apache.solr/solr-solrj/8.9.0 , and the server process (which seems like a less likely source of conflicts because I'd expect most people to be using the client).

Anyway, we still cannot predict the Caffeine versions used by Spark applications directly since most of users won't publish their applications to maven central.

Agreed, predicting conflicts from maven central is probably not the best way, but since you brought up maven statistics in your discussion I figured I'd point out that we use other libraries with more public usage than caffeine already.

I'm definitely -1 for putting Caffeine on the Spark classpath directly. But +0 if other people think the extra work to shade the three libraries and the maintenance burden are the reasonable cost to pay.

I'm not super sure how -1 on already committed but not released code works ( https://spark.apache.org/committers.html doesn't really call that out), but like I've said I'm only -0 on this revert.

That said, there have been other reservations expressed by other committers you should check in with around consensus before merging.

zsxwing · 2021-08-19T18:35:49Z

but like I've said I'm only -0 on this revert.

Thanks for your clarifying. Will wait for other committers' feedback.

rxin · 2021-08-19T19:11:28Z

Caffeine is definitely fairly popular (At Databricks, I actually had a project that used it but eventually removed it due to a timing bug we found; the bug was ultimately fixed but we had to remove it due to project deadline; Ben knows what I'm talking about here).

I'm confused here. Did we see perf improvement with Caffeine in any real scenarios? I saw some microbenchmarks in the original JIRA ticket saying there was improvements (and with some commentary from Ben saying the benchmark was not done correctly), but the pull request here claiming no improvement.

HeartSaVioR · 2021-08-20T01:52:15Z

Looks like performance benefit is something not everyone seems to be agreed with, then it cannot be the rationalization of introducing the new dependency.

In other perspective, I'd be happy to see the movement of gradually removing Guava usage (I expect that is the main reason other projects start to introduce Caffeine), "after" we build the consensus. If there's no strong agreement on the movement, nothing would happen at least for now, which is not wrong.

HyukjinKwon · 2021-08-20T02:31:57Z

I am fine with reverting if somebody feels strongly on this.

HyukjinKwon

I'm fine with this revert - it might better be conservative here after rethinking. Let me get this in if that's fine to all here.

HyukjinKwon · 2021-08-22T00:36:01Z

Merged to master.

But please let me know if you guys think differently. I am open to revert the revert.

LuciferYang added 3 commits August 19, 2021 12:47

Revert "[SPARK-34309][BUILD][FOLLOWUP] Upgrade Caffeine to 2.9.2"

57eece8

This reverts commit 281b00a.

Revert "[SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of …

f02abb6

…Guava Cache" This reverts commit 01cf6f4.

revert RemoteBlockPushResolver

756c905

github-actions bot added BUILD CORE KUBERNETES SQL labels Aug 19, 2021

LuciferYang mentioned this pull request Aug 19, 2021

[SPARK-36516][SQL] Support File Metadata Cache for ORC #33748

Closed

HyukjinKwon approved these changes Aug 21, 2021

View reviewed changes

HyukjinKwon closed this in 1ccb06c Aug 22, 2021

LuciferYang mentioned this pull request Aug 30, 2021

[SPARK-36598][SHUFFLE][SQL] Avoid the memory leak of Guava cache by add maximumWeight limit #33848

Closed

LuciferYang deleted the revert-caffeine branch October 22, 2023 07:34

Revert "[SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache" #33784

Revert "[SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache" #33784

Uh oh!

Conversation

LuciferYang commented Aug 19, 2021

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Uh oh!

LuciferYang commented Aug 19, 2021

Uh oh!

HyukjinKwon commented Aug 19, 2021

Uh oh!

sarutak commented Aug 19, 2021

Uh oh!

SparkQA commented Aug 19, 2021

Uh oh!

SparkQA commented Aug 19, 2021

Uh oh!

SparkQA commented Aug 19, 2021

Uh oh!

dongjoon-hyun commented Aug 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sunchao commented Aug 19, 2021

Uh oh!

zsxwing commented Aug 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

holdenk commented Aug 19, 2021

Uh oh!

holdenk commented Aug 19, 2021

Uh oh!

zsxwing commented Aug 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

holdenk commented Aug 19, 2021

Uh oh!

zsxwing commented Aug 19, 2021

Uh oh!

holdenk commented Aug 19, 2021

Uh oh!

holdenk commented Aug 19, 2021

Uh oh!

zsxwing commented Aug 19, 2021

Uh oh!

holdenk commented Aug 19, 2021

Uh oh!

zsxwing commented Aug 19, 2021

Uh oh!

rxin commented Aug 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

HeartSaVioR commented Aug 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

HyukjinKwon commented Aug 20, 2021

Uh oh!

HyukjinKwon left a comment

Choose a reason for hiding this comment

Uh oh!

HyukjinKwon commented Aug 22, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

dongjoon-hyun commented Aug 19, 2021 •

edited

Loading

zsxwing commented Aug 19, 2021 •

edited

Loading

zsxwing commented Aug 19, 2021 •

edited

Loading

rxin commented Aug 19, 2021 •

edited

Loading

HeartSaVioR commented Aug 20, 2021 •

edited

Loading