Skip to content

Conversation

@panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Oct 29, 2024

What changes were proposed in this pull request?

The pr aims to add name to RuleExecutor to make printing QueryExecutionMetrics's logs clearer.
Otherwise, the following printing is meaningless (without knowing that RuleExecutor's metric is being output)

24/10/29 15:12:33 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 100
Total time: 0.8585 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms

24/10/29 15:12:33 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 196
Total time: 0.78946 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms

Why are the changes needed?

There are many similar outputs printed in the log, but it seems difficult for spark developers to know which RuleExecutor generated them.

  • Before:
=== Metrics of Executed Rules ===
Total number of runs: 199
Total time: 1.394873 ms
Total number of effective runs: 2
Total time of effective runs: 0.916459 ms


=== Metrics of Executed Rules ===
Total number of runs: 196
Total time: 0.525134 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms


=== Metrics of Executed Rules ===
Total number of runs: 1
Total time: 0.00175 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms


=== Metrics of Executed Rules ===
Total number of runs: 166
Total time: 0.876414 ms
Total number of effective runs: 1
Total time of effective runs: 0.130166 ms


=== Metrics of Executed Rules ===
Total number of runs: 1
Total time: 0.007375 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms
  • After:
=== Metrics of Executed Rules org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2 ===
Total number of runs: 199
Total time: 32.982158 ms
Total number of effective runs: 2
Total time of effective runs: 32.067459 ms


=== Metrics of Executed Rules org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2 ===
Total number of runs: 196
Total time: 0.630705 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms


=== Metrics of Executed Rules org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer ===
Total number of runs: 1
Total time: 0.105459 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms


=== Metrics of Executed Rules org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1 ===
Total number of runs: 166
Total time: 2.308457 ms
Total number of effective runs: 1
Total time of effective runs: 1.22025 ms


=== Metrics of Executed Rules org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer ===
Total number of runs: 1
Total time: 0.009166 ms
Total number of effective runs: 0
Total time of effective runs: 0.0 ms

Does this PR introduce any user-facing change?

Yes, When Spark developers observe the logs printed by PlanChangeLogger#logMetrics, their meaning becomes clearer.

How was this patch tested?

Manually check

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Oct 29, 2024
@panbingkun panbingkun marked this pull request as ready for review October 29, 2024 07:21
@panbingkun
Copy link
Contributor Author

cc @cloud-fan @MaxGekk

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks helpful.

abstract class RuleExecutor[TreeType <: TreeNode[_]] extends Logging {

/** Name for this rule executor, automatically inferred based on class name. */
val name: String = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about to convert it to a method with protected to allow override in sub-classes for better names.

Copy link
Contributor Author

@panbingkun panbingkun Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, let me update it now, thanks!

@MaxGekk
Copy link
Member

MaxGekk commented Nov 14, 2024

+1, LGTM. Merging to master.
Thank you, @panbingkun and @HyukjinKwon for review.

@MaxGekk MaxGekk closed this in 0aee601 Nov 14, 2024
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, late LGTM. Thank you, @panbingkun and all!

@panbingkun
Copy link
Contributor Author

Thanks all @MaxGekk @HyukjinKwon @dongjoon-hyun ❤️ !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants