amaliujia commented Oct 7, 2022

What changes were proposed in this pull request?

  1. Add groupby to the Connect DSL and test it with more than one grouping expression.
  2. Pass a limited set of data types through the Connect proto for LocalRelation's attributes.
  3. Clean up an unused trait in the testing code.

Why are the changes needed?

Enhance connect's support for GROUP BY.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests (UT).

@amaliujia
Contributor Author

R: @cloud-fan

amaliujia changed the title from "[SPARK-40707] Add groupby to connect DSL and test more than one grouping expressions" to "[SPARK-40707][CONNECT] Add groupby to connect DSL and test more than one grouping expressions" on Oct 7, 2022
@AmplabJenkins

Can one of the admins verify this patch?

Contributor

Why do we add the group-by expressions to the aggregate expressions?

Contributor Author

Renamed the proto to make this clear.

@amaliujia
Contributor Author

amaliujia commented Oct 10, 2022

@cloud-fan PR updated. PTAL.

Contributor Author

amaliujia commented Oct 10, 2022

@cloud-fan I am still keeping this as AggregateFunction; proto.Expression is too general a type for now.

Connect does not have a NamedExpression. I will follow up to improve this.

This PR is about improving grouping_expressions anyway.

Contributor

A follow-up improvement SGTM. I don't think we even need AggregateFunction. The SQL parser usually just generates an UnresolvedFunction, and the analyzer looks up the function and figures out whether it is a scalar, aggregate, window, or table-valued function.
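
As a rough illustration of that split between parsing and analysis, here is a toy Scala model. It is not Spark's Catalyst code: UnresolvedCall, ResolvedCall, FunctionKind, and the registry contents are all hypothetical stand-ins. The idea it tries to show is that the front end only records a function name and its arguments, and a later lookup step attaches the kind.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's Catalyst classes.
object FunctionResolutionSketch {
  sealed trait FunctionKind
  case object Scalar extends FunctionKind
  case object Aggregate extends FunctionKind
  case object Window extends FunctionKind
  case object TableValued extends FunctionKind

  // What a parser (or a client) would emit: a name and arguments, with no kind attached.
  final case class UnresolvedCall(name: String, args: Seq[String])

  // What an analysis step would produce after consulting the registry.
  final case class ResolvedCall(name: String, args: Seq[String], kind: FunctionKind)

  // Hypothetical registry; the entries are chosen purely for illustration.
  val registry: Map[String, FunctionKind] = Map(
    "upper" -> Scalar,
    "sum" -> Aggregate,
    "row_number" -> Window,
    "range" -> TableValued
  )

  // Look the function up by name (case-insensitively) and attach its kind,
  // or report an error if the name is not registered.
  def resolve(call: UnresolvedCall): Either[String, ResolvedCall] =
    registry.get(call.name.toLowerCase) match {
      case Some(kind) => Right(ResolvedCall(call.name, call.args, kind))
      case None       => Left(s"Undefined function: ${call.name}")
    }
}
```

Under this model, a message type would not need a dedicated aggregate-function variant: `FunctionResolutionSketch.resolve(UnresolvedCall("sum", Seq("x")))` yields a call tagged as Aggregate purely from the registry lookup.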

amaliujia force-pushed the support_more_than_one_grouping_set branch from cc48ed7 to 33f59ed on October 11, 2022 02:29

 * This object offers methods to convert to/from connect proto to catalyst types.
 */
object TypeProtoConverter {
  def toCatalystType(t: proto.Type): DataType = {
Contributor

Shall we name it proto.DataType, and rename this object to DataTypeProtoConverter?
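
For readers without the diff at hand, a converter of the kind under review might look roughly like the sketch below. This is an assumption-laden illustration, not the PR's code: ProtoType and CatalystType are hypothetical stand-ins for the generated proto classes and Spark's DataType hierarchy, and the three supported types are picked arbitrarily to mirror the "limited data types" wording in the description.

```scala
// Sketch only: ProtoType and CatalystType are hypothetical stand-ins for the
// generated proto classes and Spark's DataType hierarchy.
object DataTypeConverterSketch {
  sealed trait ProtoType
  object ProtoType {
    case object I32 extends ProtoType
    case object I64 extends ProtoType
    case object Str extends ProtoType
    case object Unknown extends ProtoType // anything this sketch does not cover
  }

  sealed trait CatalystType
  object CatalystType {
    case object IntegerType extends CatalystType
    case object LongType extends CatalystType
    case object StringType extends CatalystType
  }

  // Proto -> Catalyst: only a limited set of types is supported; the rest fail loudly.
  def toCatalystType(t: ProtoType): CatalystType = t match {
    case ProtoType.I32     => CatalystType.IntegerType
    case ProtoType.I64     => CatalystType.LongType
    case ProtoType.Str     => CatalystType.StringType
    case ProtoType.Unknown =>
      throw new IllegalArgumentException(s"Unsupported proto type: $t")
  }

  // Catalyst -> Proto: the inverse mapping for the same limited set.
  def toProtoType(t: CatalystType): ProtoType = t match {
    case CatalystType.IntegerType => ProtoType.I32
    case CatalystType.LongType    => ProtoType.I64
    case CatalystType.StringType  => ProtoType.Str
  }
}
```

Keeping both directions in one object makes the round trip easy to check, and using sealed traits lets the compiler flag any supported type that is missing from either match.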

@cloud-fan
Contributor

thanks, merging to master!

cloud-fan closed this in 4e4a848 on Oct 11, 2022
amaliujia deleted the support_more_than_one_grouping_set branch on October 11, 2022 04:35