-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Add spilling metrics of SortMergeJoin #878
Conversation
def getNumRows: Int = if (rowAddresses == null) { | ||
0 | ||
} else { | ||
rowAddresses.size | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unrelated. Just when I tested several memory configs to try to trigger spilling, this method throws NPE sometimes.
@@ -519,6 +519,8 @@ class CometExecSuite extends CometTestBase { | |||
assert(metrics("peak_mem_used").value > 1L) | |||
assert(metrics.contains("join_time")) | |||
assert(metrics("join_time").value > 1L) | |||
assert(metrics.contains("spill_count")) | |||
assert(metrics("spill_count").value == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spilling is not triggered. But it should propagate related metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also have a test that forces spilling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to do it locally, but not able trigger it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will keep trying but wanted to merge these metrics first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @viirya just wondering if those numbers taken from Spark context or from DF spilled metrics?
They are from DataFusion SortMergeJoin operator's metrics. Comet operators' metrics are propagated from DataFusion. |
Merged. Thanks @andygrove @comphead |
(cherry picked from commit f4400f5)
Which issue does this PR close?
Closes #.
Rationale for this change
We don't propagate spilling metrics of SortMergeJoin from DataFusion to Comet yet. This patch adds spilling related metrics to Comet SortMergeJoin operator.
What changes are included in this PR?
How are these changes tested?