[#2700] feat(spark): Eager shuffle deletion #2704
base: master
Conversation
Codecov Report

@@ Coverage Diff @@
## master #2704 +/- ##
=============================================
+ Coverage 0 50.91% +50.91%
- Complexity 0 3271 +3271
=============================================
Files 0 533 +533
Lines 0 25526 +25526
Branches 0 2318 +2318
=============================================
+ Hits 0 12996 +12996
- Misses 0 11700 +11700
- Partials 0 830 +830
The review comment below refers to this snippet from the new listener:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted, SparkListenerStageSubmitted}
import org.apache.uniffle.shuffle.manager.RssShuffleManagerBase

class UniffleStageDependencyListener extends SparkListener with Logging {
```
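The snippet only shows the class declaration. As a rough, Spark-free sketch of the bookkeeping such a stage-dependency listener could perform (the names `ShuffleRefTracker` and `deleteShuffle` are hypothetical, not from this PR): track, per shuffle ID, the set of stages that still read it, and delete the shuffle data as soon as its last consumer stage completes.

```scala
import scala.collection.mutable

// Hypothetical bookkeeping for eager shuffle deletion: each shuffle ID maps to
// the set of stages that still read it; once the last consumer stage finishes,
// the shuffle data can be deleted immediately instead of at application end.
class ShuffleRefTracker(deleteShuffle: Int => Unit) {
  private val consumers = mutable.Map.empty[Int, mutable.Set[Int]]

  // Called when a stage is submitted, with the shuffle IDs it depends on.
  def onStageSubmitted(stageId: Int, shuffleDeps: Seq[Int]): Unit = synchronized {
    shuffleDeps.foreach { shuffleId =>
      consumers.getOrElseUpdate(shuffleId, mutable.Set.empty[Int]) += stageId
    }
  }

  // Called when a stage completes: drop it from every shuffle's consumer set
  // and eagerly delete shuffles that no longer have any consumers.
  def onStageCompleted(stageId: Int): Unit = synchronized {
    val done = consumers.iterator.filter { case (_, stages) =>
      stages -= stageId
      stages.isEmpty
    }.map(_._1).toList
    done.foreach { shuffleId =>
      consumers.remove(shuffleId)
      deleteShuffle(shuffleId)
    }
  }
}
```

Note that this sketch does not address the reviewer's concern below: if listener events are dropped, a consumer stage may never be unregistered, so a real implementation would still need a fallback cleanup path.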
Listener events can sometimes be lost, especially when there is a huge volume of events.
What changes were proposed in this pull request?
Introduce an eager shuffle deletion mode to explicitly reduce shuffle storage usage across the whole cluster.
Why are the changes needed?
For issue #2700.
Does this PR introduce any user-facing change?
Yes. A new option is introduced to enable this feature:
spark.rss.client.eagerShuffleDeletion.enabled=false

How was this patch tested?
Unit tests
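For reference, the feature would then be switched on per application, e.g. in spark-defaults.conf (the option name is taken from the description above; per the PR it defaults to false):

```
spark.rss.client.eagerShuffleDeletion.enabled=true
```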