Cost model of aggregation should consider the agg type. #11948

Open
lzmhhh123 opened this issue Aug 30, 2019 · 6 comments
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.


@lzmhhh123
Contributor

Feature Request

Is your feature request related to a problem? Please describe:

In pull request #11926, we found that the TiKV coprocessor executes avg in much the same way as sum and count, yet the cost model always regards it as a more expensive plan. The reason is that the cost model for aggregation only takes the number of aggregate functions into account. We should therefore improve the cost model so that it also considers the type of each aggregate function.

Describe the feature you'd like:

Improve the cost model of aggregation by taking the type of each aggregate function into account.
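
To make the request concrete, here is a minimal sketch in Python (not TiDB's actual Go code). It assumes a cost of the form rows × CPU factor × per-function weight. The names `AggFunc`, `cost_by_count`, `cost_by_type`, `AGG_WEIGHT`, and `CPU_FACTOR` are hypothetical, and the weights are placeholders rather than measurements.

```python
from dataclasses import dataclass

# Hypothetical per-row weights by aggregate type (placeholders, not measured).
# avg is weighted as sum + count, on the assumption that it does the work of both.
AGG_WEIGHT = {"count": 1.0, "sum": 1.0, "avg": 2.0, "max": 1.0, "min": 1.0}
CPU_FACTOR = 1.0  # hypothetical per-row CPU cost factor


@dataclass(frozen=True)
class AggFunc:
    name: str  # aggregate type, e.g. "sum", "count", "avg"
    arg: str   # argument expression, e.g. a column name


def cost_by_count(rows, aggs):
    """Cost that depends only on how many aggregate functions the plan has."""
    return rows * CPU_FACTOR * len(aggs)


def cost_by_type(rows, aggs):
    """Cost that weights each aggregate function by its type."""
    return rows * CPU_FACTOR * sum(AGG_WEIGHT[f.name] for f in aggs)


# Two plans that do equivalent work: avg(a) versus sum(a) plus count(a).
plan_avg = [AggFunc("avg", "a")]
plan_split = [AggFunc("sum", "a"), AggFunc("count", "a")]

if __name__ == "__main__":
    for label, cost in (("by count", cost_by_count), ("by type", cost_by_type)):
        print(label, cost(1000, plan_avg), cost(1000, plan_split))
```

With a count-only model the two equivalent plans get different costs. With type weights they can be made to agree, which is the behaviour this issue asks for.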

Describe alternatives you've considered:

None.

lzmhhh123 added the type/enhancement and sig/planner labels on Aug 30, 2019

@eurekaka
Contributor

Besides Avg, are there any other kinds of aggregation that would be split during execution?

@lzmhhh123
Contributor Author

That's a separate problem. We can add support for splitting aggregate functions one at a time and resolve it step by step, then add a rule that merges identical aggregate functions within a logicalAgg plan.
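
As a rough illustration of the split-then-merge idea, here is a Python sketch that works on plain `(name, arg)` tuples. The function `split_and_merge` is hypothetical and is not the planner's real rule. It only assumes that avg(x) can be expanded into sum(x) and count(x).

```python
def split_and_merge(aggs):
    """Expand avg(x) into sum(x) and count(x), then drop duplicate functions.

    `aggs` is a list of (name, arg) tuples; the original order is preserved.
    """
    expanded = []
    for name, arg in aggs:
        if name == "avg":
            # Split step: avg(x) is derived from sum(x) and count(x).
            expanded.extend([("sum", arg), ("count", arg)])
        else:
            expanded.append((name, arg))
    # Merge step: dict keys keep insertion order, so duplicates are removed stably.
    return list(dict.fromkeys(expanded))


if __name__ == "__main__":
    print(split_and_merge([("avg", "a"), ("sum", "a"), ("count", "a")]))
```

For a query that selects avg(a), sum(a), and count(a), the sketch leaves only sum(a) and count(a) to be computed.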

@jingyugao
Contributor

How about the cost of sum(a)+sum(b) compared to sum(a+b)?

@lzmhhh123
Contributor Author

lzmhhh123 commented Aug 30, 2019

> How about the cost of sum(a)+sum(b) compared to sum(a+b)?

If either column can contain NULL, this rewrite may produce wrong results.
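
A small Python sketch of why NULLs break the equivalence, using `None` for NULL. The helpers `sql_sum` and `sql_add` are hypothetical. They mimic standard SQL semantics, where SUM skips NULL inputs and `+` yields NULL if either operand is NULL.

```python
def sql_sum(values):
    """SQL-style SUM: ignore NULLs; return NULL when there is no non-NULL input."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sql_add(x, y):
    """SQL-style +: the result is NULL if either operand is NULL."""
    return None if x is None or y is None else x + y


# Three rows of (a, b); two of them have a NULL in one column.
rows = [(1, 10), (2, None), (None, 30)]

# sum(a) + sum(b): each column is summed over its own non-NULL values.
sum_of_sums = sql_add(sql_sum([a for a, _ in rows]), sql_sum([b for _, b in rows]))

# sum(a + b): any row with a NULL operand contributes nothing.
sum_of_adds = sql_sum([sql_add(a, b) for a, b in rows])

if __name__ == "__main__":
    print(sum_of_sums, sum_of_adds)
```

The two expressions disagree on this data, so one cannot be substituted for the other unless both columns are known to be NOT NULL.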

@breezewish
Member

breezewish commented Sep 8, 2019

Does the cost model need to be updated systematically, since the performance characteristics of the TiKV batch execution model are very different from those of the old execution model?

@lzmhhh123
Contributor Author

> Does the cost model need to be updated systematically, since the performance characteristics of the TiKV batch execution model are very different from those of the old execution model?

That's necessary. We should take it into consideration in the future.
