[SPARK-34713][SQL] Fix group by CreateStruct with ExtractValue #31808
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1091,6 +1091,25 @@ class DataFrameAggregateSuite extends QueryTest | |
| val df = spark.sql(query) | ||
| checkAnswer(df, Row(0, "0", 0, 0) :: Row(-1, "1", 1, 1) :: Row(-2, "2", 2, 2) :: Nil) | ||
| } | ||
|
|
||
| test("SPARK-34713: group by CreateStruct with ExtractValue") { | ||
|
Member
Thank you for adding this test case. BTW, do you think we can have a narrowed-down test case in
Contributor
Author
We don't have a UT for |
||
| val structDF = Seq(Tuple1(1 -> 1)).toDF("col") | ||
| checkAnswer(structDF.groupBy(struct($"col._1")).count().select("count"), Row(1)) | ||
|
|
||
| val arrayOfStructDF = Seq(Tuple1(Seq(1 -> 1))).toDF("col") | ||
| checkAnswer(arrayOfStructDF.groupBy(struct($"col._1")).count().select("count"), Row(1)) | ||
|
|
||
| val mapDF = Seq(Tuple1(Map("a" -> "a"))).toDF("col") | ||
| checkAnswer(mapDF.groupBy(struct($"col.a")).count().select("count"), Row(1)) | ||
|
|
||
| val nonStringMapDF = Seq(Tuple1(Map(1 -> 1))).toDF("col") | ||
| // Spark implicit casts string literal "a" to int to match the key type. | ||
|
Member
Oh, since lines 1102 and 1105 work and the following works, we don't care about the field name at all?
scala> Seq(Tuple1(Map("b" -> "b"))).toDF("col").groupBy(struct($"col.a")).count().select("count").show
+-----+
|count|
+-----+
|    1|
+-----+
Member
cc @maropu, too.
Contributor
Author
The
Member
Ah, I was confused here. Thanks. |
||
| checkAnswer(nonStringMapDF.groupBy(struct($"col.a")).count().select("count"), Row(1)) | ||
|
|
||
| val arrayDF = Seq(Tuple1(Seq(1))).toDF("col") | ||
| val e = intercept[AnalysisException](arrayDF.groupBy(struct($"col.a")).count()) | ||
| assert(e.message.contains("requires integral type")) | ||
| } | ||
| } | ||
|
|
||
| case class B(c: Option[Double]) | ||
|
|
||
Oh, do we have a valid case for this?
e.g. `col[1]` to get the map value. It's not a single `UnresolvedAttribute` (multi-part name like `a.b.c`) and is unrelated to this bug fix.
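
For anyone who wants to try the scenarios from the SPARK-34713 test outside `DataFrameAggregateSuite`, here is a minimal sketch. It assumes a Spark build that includes this fix, default (non-ANSI) SQL settings, and Spark on the classpath. The object name `GroupByStructSketch`, the helper names, and the local session settings are hypothetical and not part of the PR. It covers the struct, array-of-struct, and map cases from the diff, plus the missing-key map from the review comment, each of which the thread reports as producing a single group with count 1.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}

// Hypothetical standalone reproduction; names here are illustrative only.
object GroupByStructSketch {

  // Groups by a struct wrapping an unresolved multi-part name (e.g. "col._1")
  // and returns the count of the first group. Every input below has exactly
  // one row, so there is only one group to read.
  private def groupCount(df: DataFrame, path: String): Long =
    df.groupBy(struct(col(path))).count().select("count").head().getLong(0)

  // Runs each scenario and returns a label -> count map.
  def groupCounts(spark: SparkSession): Map[String, Long] = {
    import spark.implicits._
    Map(
      // Struct column: col._1 extracts a struct field.
      "struct field" ->
        groupCount(Seq(Tuple1(1 -> 1)).toDF("col"), "col._1"),
      // Array of structs: col._1 extracts the field from every element.
      "array of struct field" ->
        groupCount(Seq(Tuple1(Seq(1 -> 1))).toDF("col"), "col._1"),
      // Map column: col.a looks up the key "a".
      "map key present" ->
        groupCount(Seq(Tuple1(Map("a" -> "a"))).toDF("col"), "col.a"),
      // Map without the key "a", as in the review comment's REPL session.
      "map key missing" ->
        groupCount(Seq(Tuple1(Map("b" -> "b"))).toDF("col"), "col.a")
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-34713-sketch")
      .getOrCreate()
    try {
      groupCounts(spark).foreach { case (label, n) => println(s"$label: $n") }
    } finally {
      spark.stop()
    }
  }
}
```

The sketch uses `col(path)` rather than `df(path)` so that the grouping expression starts out as an unresolved multi-part name, which is the path the test exercises with `$"col._1"`.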