*: Support stream aggregation in new plan #4481

Merged
merged 15 commits into from
Sep 14, 2017

@zimulala (Contributor) commented Sep 8, 2017

Support stream aggregation in the new plan and clean up the tests.

@@ -380,7 +380,7 @@ func (s *testPlanSuite) TestDAGPlanBuilderSubquery(c *C) {
// Test Nested sub query.
{
Member

Add tests for stream agg + index join, stream agg + merge join, stream agg + limit, and stream agg + sort.

Contributor Author

Done.

// groupByCols stores the columns that are group-by items.
groupByCols []*expression.Column

possibleProperties [][]*expression.Column
childCount float64 // childCount is the child plan's count.
cardinality float64
Member

cardinality is always equal to the count in LogicalAggregation's profile.

// groupByCols stores the columns that are group-by items.
groupByCols []*expression.Column

possibleProperties [][]*expression.Column
childCount float64 // childCount is the child plan's count.
Member

Maybe inputCount is a better name; childCount seems to indicate the number of children.

@@ -1090,26 +1090,80 @@ func (p *TopN) getChildrenPossibleProps(prop *requiredProp) [][]*requiredProp {
return props
}

func (p *LogicalAggregation) getStreamAggs() []PhysicalPlan {
for _, aggFunc := range p.AggFuncs {
if aggFunc.GetMode() == expression.FinalMode {
Member

This code seems never to be reached?

Contributor Author

It can be reached: when we decompose an aggregation, we get a FinalMode aggregation.

return nil
}
}
// group by a + b is not interested in any order.
Member

Add a test for this case.

Contributor Author

Done.

plan/task.go Outdated
}
}
task = finishCopTask(cop, p.ctx, p.allocator)
task.addCost(task.count()*cpuFactor + p.cardinality*hashAggMemFactor)
Member

Use p.statsProfile().count instead of p.cardinality.

plan/stats.go Outdated
@@ -81,7 +81,6 @@ func (p *DataSource) getStatsProfileByFilter(conds expression.CNFExprs) *statsPr
}
selectivity, err := p.statisticTable.Selectivity(p.ctx, conds)
if err != nil {
log.Warnf("An error happened: %v, we have to use the default selectivity", err.Error())
Member

Why remove this log?


reqProp := &requiredProp{taskTp: rootTaskType, cols: p.propKeys, expectedCnt: prop.expectedCnt * p.childCount / p.profile.count}
if !prop.isEmpty() {
if prop.desc {
Member

A desc property can also be passed down.

agg.SetSchema(p.schema.Clone())
agg.profile = p.profile
aggs = append(aggs, agg)
if len(p.possibleProperties) == 0 {
Member

We can remove this check.

func (p *LogicalAggregation) generatePhysicalPlans() []PhysicalPlan {
ha := PhysicalAggregation{
aggs := make([]PhysicalPlan, 0, 2)
Member

A capacity of 2 is not reasonable; len(p.possibleProperties) + 1 is the upper bound.

@pingcap pingcap deleted a comment from hanfei1991 Sep 8, 2017
@shenli (Member) commented Sep 9, 2017

@hanfei1991 PTAL

},
{
sql: "select count(*) from t where e > 1 group by b",
best: "TableReader(Table(t)->Sel([gt(test.t.e, 1)])->HashAgg)->HashAgg",
Member

Why not use the index on e for filtering, or the index on b for stream aggregation?

Member

Using an index would cause a double read.

@hanfei1991 (Member)

LGTM

@zimulala zimulala added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 12, 2017
@winoros (Member) left a comment

The rest LGTM.

best: "IndexLookUp(Index(t.b_c)[[-inf <nil>,20 +inf]], Table(t)->HashAgg)->HashAgg",
},
{
sql: "select count(e) from t where t.b <= 30",
Member

Are the separate cases for t.b <= 30 and t.b <= 40 really necessary?

Contributor Author

These are old tests. @hanfei1991

Member

oh

@winoros (Member) commented Sep 13, 2017

/run-all-test

@zimulala (Contributor Author)

PTAL @winoros

@winoros (Member) left a comment

LGTM

@winoros winoros added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 14, 2017
@XuHuaiyu XuHuaiyu merged commit d0be70d into master Sep 14, 2017
@XuHuaiyu XuHuaiyu deleted the zimuxia/stream-agg branch September 14, 2017 06:22
@zimulala zimulala added the sig/planner SIG: Planner label Apr 1, 2021
Labels
sig/planner SIG: Planner status/LGT2 Indicates that a PR has LGTM 2.

5 participants