Skip to content

Conversation

@viirya
Copy link
Member

@viirya viirya commented Dec 7, 2022

What changes were proposed in this pull request?

This patch updates how SQLMetric merges two invalid instances where the value is both -1.

Why are the changes needed?

We use -1 as initial value of SQLMetric, and change it to 0 while merging with other SQLMetric instances. A SQLMetric will be treated as invalid and filtered out later.

While we are developing with Spark, it is trouble behavior that two invalid SQLMetric instances merge to a valid SQLMetric because merging will set the value to 0.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests.

@github-actions github-actions bot added the SQL label Dec 7, 2022
@viirya
Copy link
Member Author

viirya commented Dec 7, 2022

cc @sunchao @dongjoon-hyun

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@viirya
Copy link
Member Author

viirya commented Dec 7, 2022

Thank you @sunchao @dongjoon-hyun !

Pending on CI (not sure if any existing tests depend on current behavior).

@dongjoon-hyun
Copy link
Member

Could you re-trigger the failure CI test only?

@viirya
Copy link
Member Author

viirya commented Dec 8, 2022

I can reproduce the failed tests. Trying to fix them.

if (rowCount == 0) {
// For empty relation, the query stage doesn't serialize any bytes.
// The SQLMetric keeps initial value.
assert(stats.sizeInBytes == -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 It totally makes sense.

@dongjoon-hyun
Copy link
Member

All tests passed. Merged to master. Thank you, @viirya and all!

@viirya
Copy link
Member Author

viirya commented Dec 8, 2022

Thank you!

// The exchange doesn't serialize any bytes.
// The SQLMetric keeps initial value.
testMetricsInSparkPlanOperator(exchanges(1),
Map("dataSize" -> -1, "shuffleRecordsWritten" -> 0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do we show in the SQL UI? printing -1 as the size is a bit weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think -1 will be filtered out.

beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
…metric

### What changes were proposed in this pull request?

This patch updates how `SQLMetric` merges two invalid instances where the value is both -1.

### Why are the changes needed?

We use -1 as initial value of `SQLMetric`, and change it to 0 while merging with other `SQLMetric` instances. A `SQLMetric` will be treated as invalid and filtered out later.

While we are developing with Spark, it is trouble behavior that two invalid `SQLMetric` instances merge to a valid `SQLMetric` because merging will set the value to 0.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests.

Closes apache#38969 from viirya/minor_sql_metrics.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Dec 31, 2022
…it's invalid

### What changes were proposed in this pull request?

This is a followup of #38969 to fix a regression.

SQL UI is not the only way for end users to see the SQL metrics. They can also access the accumulator values in the physical plan programmatically via query execution listener. We should be consistent with the SQL UI and not expose the -1 value.

### Why are the changes needed?

make SQL UI and the accumulator value consistent

### Does this PR introduce _any_ user-facing change?

Yes, as end users can access accumulator values directly.

### How was this patch tested?

existing tests

Closes #39311 from cloud-fan/metric.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
…metric (apache#1643)

### What changes were proposed in this pull request?

This patch updates how `SQLMetric` merges two invalid instances where the value is both -1.

### Why are the changes needed?

We use -1 as initial value of `SQLMetric`, and change it to 0 while merging with other `SQLMetric` instances. A `SQLMetric` will be treated as invalid and filtered out later.

While we are developing with Spark, it is trouble behavior that two invalid `SQLMetric` instances merge to a valid `SQLMetric` because merging will set the value to 0.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests.

Closes apache#38969 from viirya/minor_sql_metrics.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants