Closed
Description
Describe the bug
I invoked the following in the spark-shell, version 3.5.6.
$SPARK_HOME/bin/spark-shell $COMET/spark/target/comet-spark-spark3.5_2.12-0.10.0-SNAPSHOT --conf spark.plugins=org.apache.spark.CometPlugin --conf spark.comet.enabled=true --conf spark.comet.exec.enabled=true
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.Row
val schema = StructType(Seq(StructField("id", IntegerType, nullable = false), StructField("value", IntegerType, nullable = false)))
val data = Seq(Row(1, 10), Row(2, 20), Row(3, 10), Row(4, 30), Row(5, 20), Row(6, 10))
val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
val out = "groupby"
df.write.mode("overwrite").parquet(out)
val parquetDF = spark.read.parquet(out)
val grouped = parquetDF.groupBy("id").count()
grouped.explain()
The explain output shows CometHashAggregate prefixed with !:
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[id#137], functions=[count(1)])
   +- Exchange hashpartitioning(id#137, 4), ENSURE_REQUIREMENTS, [plan_id=420]
      +- !CometHashAggregate [id#137], Partial, [id#137], [partial_count(1)]
         +- CometNativeScan parquet [id#137] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/home/testing/groupby], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int>
There were no correctness issues or failures.
A comment in the Spark codebase here says that "!" indicates an invalid plan and "'" indicates an unresolved plan. However, I haven't verified whether this is the only place in the code where ! gets added to the plan.
I filed this issue after seeing that comment in the Spark code, to bring it to attention.
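To illustrate what that comment seems to describe: as far as I can tell from Spark's QueryPlan, the "!" prefix is printed when a non-leaf node references attributes that none of its children output. This is my reading and I have not checked it against every code path. The sketch below models that rule in plain Scala. Attr, Node, and PrefixDemo are hypothetical stand-ins, not Spark or Comet classes, and the "other#1" attribute is made up for the example.

```scala
// Simplified, hypothetical model of how a plan node's print prefix is chosen.
// Attr and Node are illustrative stand-ins, not Spark classes.
final case class Attr(name: String)

final case class Node(
    name: String,
    references: Set[Attr], // attributes the node's expressions read
    output: Set[Attr],     // attributes the node produces
    children: Seq[Node] = Nil) {

  // Everything the children make available to this node.
  def inputSet: Set[Attr] = children.flatMap(_.output).toSet

  // Attributes the node reads that no child provides.
  def missingInput: Set[Attr] = references -- inputSet

  // "!" only for non-leaf nodes that have missing input.
  def statePrefix: String =
    if (missingInput.nonEmpty && children.nonEmpty) "!" else ""

  def render: String = statePrefix + name
}

object PrefixDemo {
  val id: Attr = Attr("id#137")

  // Leaf node: never gets "!" because it has no children.
  val scan: Node = Node("Scan", references = Set.empty, output = Set(id))

  // Reads only what the scan provides, so no prefix.
  val okAgg: Node =
    Node("Aggregate", references = Set(id), output = Set(id), children = Seq(scan))

  // Reads a made-up attribute the scan does not provide, so it renders with "!".
  val badAgg: Node =
    Node("Aggregate", references = Set(id, Attr("other#1")), output = Set(id), children = Seq(scan))
}
```

If this reading is right, the "!" on CometHashAggregate would mean that the operator reports a reference its child's output does not contain, which would be consistent with the query still returning correct results.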
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response
Labels
bug (Something isn't working)