feat: Add manual test to calculate spark builtin functions coverage #263

comphead · 2024-04-12T18:12:32Z

Which issue does this PR close?

Closes #.
Related to #240

Rationale for this change

Comet needs a tool to evaluate which Spark builtin functions are currently supported by Comet with native execution.

What changes are included in this PR?

The PR adds the manual test CometExpressionCoverageSuite, which creates two files,
doc/spark_coverage.txt and doc/spark_coverage_agg.txt, containing the current coverage results.

How are these changes tested?

comphead · 2024-04-12T18:18:55Z

The generated files need to be excluded from RAT.

sunchao

Nice, this is pretty neat!

doc/spark_coverage.txt

advancedxy

This is great work. Left two comments.

doc/spark_coverage.txt

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala

viirya · 2024-04-16T06:42:57Z

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala

+        "select result, d._2 as reason, count(1) cnt from (select name, t.details.result, explode_outer(t.details.details) as d from t) group by 1, 2"),
+      500,
+      0)
+    Files.write(Paths.get("doc/spark_coverage_agg.txt"), str_agg.getBytes(StandardCharsets.UTF_8))


Suggested change

Files.write(Paths.get("doc/spark_coverage_agg.txt"), str_agg.getBytes(StandardCharsets.UTF_8))

Files.write(Paths.get("doc/spark_agg_expr_coverage.txt"), str_agg.getBytes(StandardCharsets.UTF_8))

Oh, the file is not for aggregation expressions. nvm.

But I think it may be better to have the aggregation of expression coverage in the same file?

viirya · 2024-04-16T06:46:25Z

doc/spark_coverage.txt

@@ -0,0 +1,421 @@
+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|name                       |details                                                                                                                                                                                                                                                                                                                                                                                             |


I'm wondering if we can produce a Markdown document instead of plain text, so that it can be displayed with formatting. We can do it in a follow-up.
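To make the Markdown suggestion concrete, here is a minimal sketch of rendering report rows as a Markdown table. It is not code from this PR: `MarkdownTable` and `render` are hypothetical names, and the rows are assumed to be already collected as plain strings rather than read from a Spark DataFrame.

```scala
object MarkdownTable {
  /** Escapes pipe characters so a cell cannot break the table layout. */
  private def escape(cell: String): String = cell.replace("|", "\\|")

  /** Renders a header and rows as a Markdown table (hypothetical helper). */
  def render(header: Seq[String], rows: Seq[Seq[String]]): String = {
    require(rows.forall(_.length == header.length), "every row must match the header width")
    val headerLine = header.map(escape).mkString("| ", " | ", " |")
    val separator = header.map(_ => "---").mkString("| ", " | ", " |")
    val body = rows.map(_.map(escape).mkString("| ", " | ", " |"))
    (headerLine +: separator +: body).mkString("\n")
  }
}
```

Escaping `|` matters here because the `details` column can contain arbitrary query text.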

viirya · 2024-04-16T06:47:25Z

doc/spark_coverage_agg.txt

+|FAILED |Failed on native side                         |4  |
+|PASSED |OK                                            |254|
+|SKIPPED|null                                          |12 |
+|FAILED |Failed on something else. Check query manually|10 |


Hmm, I'm not sure I understand what this document represents?

SKIPPED is an indicator that there are no examples in the Spark catalog. I'll add a more meaningful message.
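For readers with the same question, the aggregated file is a count of functions per (result, reason) pair. The sketch below shows that grouping on plain Scala collections. It is an illustration only: `QueryResult` and `aggregate` are hypothetical names, and the suite itself computes this with a Spark SQL `group by`.

```scala
object CoverageSummary {
  // Hypothetical record type; the real suite keeps results in a Spark DataFrame.
  final case class QueryResult(name: String, result: String, reason: Option[String])

  /** Counts functions per (result, reason) pair, mirroring a GROUP BY over both columns. */
  def aggregate(results: Seq[QueryResult]): Map[(String, String), Int] =
    results
      .groupBy(r => (r.result, r.reason.getOrElse("null"))) // missing reason shown as "null"
      .map { case (key, group) => key -> group.size }
}
```

A SKIPPED row with no reason is what produced the `SKIPPED | null` line in the table above.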

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala

comphead · 2024-04-18T01:00:38Z

Something is wrong with TPC-DS Correctness: it has been running for 5 hours and is stuck on downloading Maven deps.

advancedxy

LGTM, except some minor comments which could be addressed in followup PRs.

advancedxy · 2024-04-18T02:32:29Z

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala

+
+  import testImplicits._
+
+  private val rawCoverageFilePath = "doc/spark_builtin_expr_coverage.txt"


On second thought, I think we should add the Spark version to this file name. More functions will be added in new Spark versions, so it might be helpful to indicate how many functions are supported in each Spark version.

This can be done in a follow up though
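One way this could look, as a sketch only: derive the report path from a version string. `CoverageFileName` and `forSparkVersion` are hypothetical names, and the choice to keep only the major.minor part is an assumption. In the suite the version string could come from `spark.version`.

```scala
object CoverageFileName {
  /** Builds a per-version report path, keeping only major.minor of the version (assumed naming). */
  def forSparkVersion(sparkVersion: String, dir: String = "doc"): String = {
    val majorMinor = sparkVersion.split('.').take(2).mkString(".")
    s"$dir/spark_builtin_expr_coverage_$majorMinor.txt"
  }
}
```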

advancedxy · 2024-04-18T02:35:44Z

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala

+    Files.write(Paths.get(aggCoverageFilePath), str_agg.getBytes(StandardCharsets.UTF_8))
+
+    val str = showString(spark.sql("select * from t order by 1"), 1000, 0)
+    Files.write(Paths.get(rawCoverageFilePath), str.getBytes(StandardCharsets.UTF_8))


I second @viirya's point, #263 (comment). It would be better to put the aggregated result in the same file.

Ah, I see you filed a new issue. It could be addressed in a followup PR then.

viirya · 2024-04-19T05:23:07Z

doc/spark_builtin_expr_coverage_agg.txt

+|FAILED |Failed on native side                                   |16 |
+|FAILED |Failed on something else. Check query manually          |4  |
+|PASSED |OK                                                      |101|
+|SKIPPED|No examples found in spark.sessionState.functionRegistry|12 |


I'm thinking of putting this aggregated summary in the expr coverage file too; it would be more readable. We can do that when converting the coverage file to Markdown format.

yes, I combined all of that in #282
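As a rough illustration of the single-file idea (not the code from #282, which is not shown in this thread), the summary and the per-function details could be written to one file with the same `Files.write` call the suite already uses. `CombinedReport` and the section labels are assumptions.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object CombinedReport {
  /** Writes the summary followed by the details into a single UTF-8 file (hypothetical layout). */
  def write(target: Path, summary: String, details: String): Path = {
    val content =
      Seq("Summary", summary.trim, "", "Details", details.trim).mkString("", "\n", "\n")
    Files.write(target, content.getBytes(StandardCharsets.UTF_8))
  }
}
```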

viirya · 2024-04-19T05:30:19Z

spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala

+      // disable constant folding. This optimization rule precompute and select value as literal
+      // which subsequently leads to false positives
+      //
+      // ConstantFolding is a operator optimization rule in Catalyst that replaces expressions
+      // that can be statically evaluated with their equivalent literal values.


This function doesn't always exclude ConstantFolding; it excludes whichever rules are passed in. This comment should be moved to where you pass ConstantFolding when calling testSingleLineQuery in CometExpressionCoverageSuite.
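To illustrate the point, here is a minimal sketch of a helper that excludes only the rules its caller names, so the explanation of why ConstantFolding is excluded belongs at the call site. `ExcludedRules` and `conf` are hypothetical names and this is not the actual `testSingleLineQuery` signature. The configuration key and the rule class name are the real Spark ones.

```scala
object ExcludedRules {
  // Spark SQL configuration key for disabling optimizer rules by class name.
  val ConfKey = "spark.sql.optimizer.excludedRules"
  val ConstantFolding = "org.apache.spark.sql.catalyst.optimizer.ConstantFolding"

  /** Builds the config entry for the rules the caller wants excluded; nothing is excluded by default. */
  def conf(excluded: Seq[String] = Seq.empty): Map[String, String] =
    if (excluded.isEmpty) Map.empty else Map(ConfKey -> excluded.mkString(","))
}
```

A coverage test would then call `ExcludedRules.conf(Seq(ExcludedRules.ConstantFolding))` and carry the comment about false positives next to that call.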

viirya

A few minor comments.

feat: Add manual test to calculate spark builtin functions coverage

f47fb0d

comphead added 2 commits April 12, 2024 16:21

rat

6694d04

fmt

0b7257f

comphead requested review from viirya and sunchao April 13, 2024 04:54

sunchao approved these changes Apr 13, 2024

advancedxy reviewed Apr 15, 2024

doc/spark_coverage.txt (outdated)

advancedxy reviewed Apr 15, 2024

doc/spark_coverage.txt (outdated)

viirya reviewed Apr 16, 2024

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala (outdated)

viirya reviewed Apr 16, 2024

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala (outdated)

viirya reviewed Apr 16, 2024

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala (outdated)

viirya reviewed Apr 16, 2024

spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala (outdated)

comphead added 2 commits April 17, 2024 09:00

comments

88ad3e0

comments

286ffa2

This was referenced Apr 17, 2024

CI: Add spark expression coverage to build process #281

Open

feat: Make spark builtin expression coverage report more readable #282

Closed

comphead added 3 commits April 17, 2024 10:47

comments

bd1c1a4

tests

2518809

tests

7378d56

empty

322f69e

advancedxy approved these changes Apr 18, 2024

comphead requested a review from viirya April 18, 2024 15:32

viirya reviewed Apr 19, 2024

Update spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala

71a189e

viirya reviewed Apr 19, 2024

viirya approved these changes Apr 19, 2024

comphead added 2 commits April 19, 2024 15:10

comments

8daec68

Merge remote-tracking branch 'origin/dev' into dev

e163d74

viirya approved these changes Apr 19, 2024

sunchao merged commit b06794f into apache:main Apr 20, 2024
28 checks passed

himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024

feat: Add manual test to calculate spark builtin functions coverage (apache#263)

39c961d
