Implement Underlying Grouping Sets #42631

AilinKid · 2023-03-28T05:35:24Z

Feature Request

Is your feature request related to a problem? Please describe:

Grouping Sets is the internal mechanism that supports the Multi-Distinct-Aggregate MPP optimization and the Rollup/Cube syntax.

SELECT a, b, sum(expression) FROM table GROUP BY a, b WITH ROLLUP;

Some modern engines, such as Spark SQL, let the user list the wanted grouping sets explicitly, like this:

SELECT a, b, sum(expression) FROM table GROUP BY a, b GROUPING SETS((a,b),(x,x),(...));

Each grouping set listed above requires a different aggregation granularity, so the underlying data must be expanded into multiple copies, one per grouping set. The user then sees the union of the rows aggregated at each level.
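The expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Expand operator: the function names, the `gid` column, and the value column `v` are all assumptions made for the example. Each input row is replicated once per grouping set, with the columns outside that set replaced by NULL (`None`), and a single aggregation over the expanded rows then yields the union of all levels.

```python
from collections import defaultdict

def expand(rows, columns, grouping_sets):
    """Replicate every row once per grouping set.

    Columns that are not part of the current grouping set are set to None,
    and a hypothetical "gid" column records which grouping set the copy feeds.
    """
    out = []
    for gid, gset in enumerate(grouping_sets):
        for row in rows:
            copy = {c: (row[c] if c in gset else None) for c in columns}
            copy["gid"] = gid
            copy["v"] = row["v"]  # "v" is an assumed value column to aggregate
            out.append(copy)
    return out

def aggregate_sum(expanded, columns):
    """Sum "v" grouped by the (possibly nulled) columns plus gid."""
    sums = defaultdict(int)
    for r in expanded:
        key = tuple(r[c] for c in columns) + (r["gid"],)
        sums[key] += r["v"]
    return dict(sums)

rows = [
    {"a": 1, "b": 10, "v": 5},
    {"a": 1, "b": 20, "v": 7},
    {"a": 2, "b": 10, "v": 3},
]
# Two grouping sets: (a, b) and (a)
expanded = expand(rows, ["a", "b"], [("a", "b"), ("a",)])
result = aggregate_sum(expanded, ["a", "b"])
```

Keeping `gid` in the grouping key separates a NULL produced by the expansion from a NULL that was already present in the data.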

Besides the explicit SQL syntax, there is another way to describe a composed list of grouping sets implicitly, which is exactly what the rollup and cube syntax do. For example, rollup(a,b,c) implies N = 4 grouping sets, derived by incrementally extending a prefix of the expression list: (), (a), (a,b), (a,b,c). The cube syntax works the same way, but its derivation is more complicated.
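As a small illustration of how the implicit grouping sets can be derived, the sketch below enumerates them for both modifiers. The helper names `rollup_sets` and `cube_sets` are hypothetical and exist only for this example. Rollup yields every prefix of the column list, while cube yields every subset.

```python
from itertools import combinations

def rollup_sets(cols):
    """Every prefix of cols, from the full list down to the empty set."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube_sets(cols):
    """Every subset of cols, from the full list down to the empty set."""
    return [s for k in range(len(cols), -1, -1) for s in combinations(cols, k)]

rollup = rollup_sets(["a", "b", "c"])
cube = cube_sets(["a", "b", "c"])
```

For three columns, `rollup` holds the 4 sets listed above, while `cube` holds 8 sets, since every column is either present or absent.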

For a Multi Distinct Aggregate query such as

select count(distinct a), count(distinct b) from t

the distinct semantics require aggregating over groups formed by a or by b, but a single copy of the data cannot be grouped by a and by b at the same time. We therefore resort to separate grouping sets, (a) and (b), so that the underlying data is expanded to feed each aggregation.
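The following sketch shows how an expand step could feed both distinct counts in one pass. It is a simplified model under stated assumptions, not the planner's actual rewrite: the function name and the `gid` tag are hypothetical, and the deduplication that an MPP plan would perform by shuffling on the grouping key is modeled here with an in-memory set.

```python
def count_distinct_via_expand(rows, distinct_cols):
    """Compute count(distinct col) for several columns from expanded data."""
    # Expand: one copy of each row per distinct column, keeping only that
    # column's value and tagging the copy with a grouping id.
    expanded = []
    for gid, col in enumerate(distinct_cols):
        for row in rows:
            expanded.append((gid, row[col]))

    # First stage: group by (gid, value), which removes duplicates.
    groups = set(expanded)

    # Second stage: count the non-NULL groups that belong to each gid.
    counts = [0] * len(distinct_cols)
    for gid, value in groups:
        if value is not None:
            counts[gid] += 1
    return dict(zip(distinct_cols, counts))

rows = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
    {"a": 2, "b": "x"},
    {"a": None, "b": "y"},
]
counts = count_distinct_via_expand(rows, ["a", "b"])
```

Because the first stage groups on the pair (gid, value), it can be partitioned by that pair, which is what makes the work distributable in principle.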

All three cases above depend on the implementation of Grouping Sets and the Expand operator, which is what this issue calls for.

Describe the feature you'd like:

Shown above.

Describe alternatives you've considered:

As a workaround for the Rollup syntax, rewrite the SQL as a union of several sub-queries, each with its own group-by items.
For the Multi Distinct Aggregate optimization there is no workaround: there is no way to distribute the computation across multiple MPP nodes.
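To make the rollup workaround concrete, here is a minimal sketch that generates the union rewrite as a SQL string. The function name `rollup_as_union` and the table and column names are assumptions for illustration only, and the sketch ignores details such as the column types of the NULL placeholders.

```python
def rollup_as_union(table, group_cols, agg_expr):
    """Build one SELECT per rollup level and join them with UNION ALL."""
    selects = []
    for n in range(len(group_cols), -1, -1):
        kept = group_cols[:n]
        # Columns that are rolled up at this level are emitted as NULL.
        select_list = [c if c in kept else f"NULL AS {c}" for c in group_cols]
        select_list.append(agg_expr)
        sql = f"SELECT {', '.join(select_list)} FROM {table}"
        if kept:
            sql += f" GROUP BY {', '.join(kept)}"
        selects.append(sql)
    return "\nUNION ALL\n".join(selects)

sql = rollup_as_union("t", ["a", "b"], "sum(v)")
print(sql)
```

Each sub-query scans the table again, which is one reason a dedicated Expand operator is preferable to this rewrite.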

Teachability, Documentation, Adoption, Migration Strategy:

Schedule of related issues:

Underlying Grouping Sets and Expand Operator

planner, expression: support multi-distinct agg under MPP mode #39973 multi distinct aggregate planner side based on Expand1
executor: support the grouping sets tipb#283 grouping sets based Expand1 pb support
plan, executor: implement Expand operator for grouping sets tiflash#6545 multi distinct aggregate tiflash side based on Expand1
planner, executor: refactor expand logic from analyzing grouping sets to level projections tiflash#7169 leveled projections refactor of Expand2
executor: add pb support for leveled projection of expand2 tipb#300 leveled projection Expand2 pb support

Rollup Syntax

Add GroupingFunctionMetadata tipb#298 Grouping function pb support
Prepare grouping function for roll up #42463 Grouping function TiDB side implementation
Support the with rollup syntax for group clause #43427 Support Rollup syntax in TiDB parser side
grouping function should support multi args like grouping(a,b,c)... tiflash#7590 grouping function should support multi args like grouping(a,b,c)

Infra Support

expression: add simple expression semantic equal check logic #43558 Expression tree canonical semantic equal check
Support basic expression(grouping sets) transformation for Rollup syntax #44112 Support Rollup grouping sets transformation
expression: modify the grouping function to receive uint64 #44184 Modify the grouping function to receive uint64
rollup expand should support being converted to pb.expand2 #45179 Physical expand support expand-pb protocol V2

Plan Operator

planner: introduce the logical Expand for rollup syntax #44214 Introduce Logical Expand for rollup syntax
support grouping function/col/expression rewriting and physical plan exhaustion for rollup expand OP #44487 Support grouping function/col/expression rewriting and physical plan exhaustion for rollup expand OP

Bug Fix

function: fix grouping function result name tiflash#7718 function: fix grouping function result name
having item ref-ed to a grouping expression shouldn't be push down through expand #45647 having item ref-ed to a grouping expression shouldn't be push down through expand
order-by clause should be resolved to grouping set item #45593 order-by clause should be resolved to grouping set item
rollup aggregation report tiflash column name mismatch #45576 rollup aggregation report tiflash column name mismatch
expand: tpc-ds rollup related queries errors in nightly cluster tiflash#7830 expand: tpc-ds rollup related queries errors in nightly cluster
rollup expand should support being converted to pb.expand2 #45179 rollup expand should support being converted to pb.expand2
grouping function arg check logic errors in pre-only-full-group-check phase #45661 function: defer grouping function validation logic to expression rewriter.
orderby clause ref priority doesn't take effect #45797 orderby clause ref priority doesn't take effect
grouping function can't just be recreated with function name and args because it has additional metadata #45756 grouping function can't just be recreated with function name and args because it has additional metadata


AilinKid added the type/feature-request, sig/planner, and sig/execution labels Mar 28, 2023

AilinKid assigned AilinKid and xzhangxian1008 Mar 28, 2023

AilinKid unassigned xzhangxian1008 Apr 26, 2023

AilinKid mentioned this issue Sep 12, 2023

doc: add grouping sets doc #46906

Merged

12 tasks

AilinKid closed this as completed in #46906 Jan 8, 2024

AilinKid mentioned this issue Jul 2, 2024

planner: fix grouping sets document typo #54387

Merged

13 tasks

ti-chi-bot bot pushed a commit that referenced this issue Jul 2, 2024

planner: fix grouping sets document typo (#54387)

fb132a9

ref #42631

AilinKid mentioned this issue Jul 15, 2024

planner, executor: allow build root task type of expand operator and implement expand executor #54536

Merged

13 tasks

ti-chi-bot bot pushed a commit that referenced this issue Jul 24, 2024

planner, executor: allow build root task type of expand operator and …

cbe807f

…implement expand executor (#54536) close #42631

AilinKid mentioned this issue Jul 26, 2024

planner: import more expand test. #54962

Merged

13 tasks

ti-chi-bot bot pushed a commit that referenced this issue Jul 26, 2024

planner: import more expand test. (#54962)

0abf3ae

close #42631

AilinKid mentioned this issue Jul 29, 2024

planner: import more tests about rollup expand #55024

Merged

13 tasks

ti-chi-bot bot pushed a commit that referenced this issue Jul 30, 2024

planner: import more tests about rollup expand (#55024)

5d5de41

close #42631

hawkingrei pushed a commit to hawkingrei/tidb that referenced this issue Aug 1, 2024

planner: import more expand test. (pingcap#54962)

bd6e509

close pingcap#42631

hawkingrei pushed a commit to hawkingrei/tidb that referenced this issue Aug 1, 2024

planner: import more tests about rollup expand (pingcap#55024)

f023067

close pingcap#42631

Defined2014 mentioned this issue Aug 20, 2024

support GROUP BY modifiers #4250

Closed
