docs: proposal a SQL planner based on the Volcano/Cascades model #7543
@shenli @CaitinChen PTAL
The physical optimization for the operators on the storage layer also suffers
from the poor extensibility. In the present planner, we use "root" and "cop"
task to distinguish the operators executed on TiDB and the storage layer, TiKV.
The way to seperate a "cop" task is also not extensible, and "cop" tasks are
Remove the extra space in "also not"
seperate -> separate
- **Pattern**

Pattern describes a piece of a logical expression. It's a tree-like structure,
What's the difference between `Pattern` and `Expression`? `Expression` is also a tree-like structure.
`Pattern` describes the shape of an `Expression` tree. It only concerns the type of each `Expression` node.
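To make the distinction concrete, here is a minimal Go sketch. It is not the code from the proposal: `Operand`, `Expr`, `Pattern`, `Detail`, and `Match` are hypothetical names chosen for illustration, and the assumption that a pattern leaves unmentioned children unconstrained is mine. The point is that an expression node carries a payload, while a pattern carries only operator types.

```go
package main

import "fmt"

// Operand identifies the type of a logical operator (hypothetical names).
type Operand int

const (
	OperandAny Operand = iota // wildcard: matches any operator type
	OperandJoin
	OperandSelection
	OperandDataSource
)

// Expr is a simplified logical expression node. A real node would also carry
// conditions, schemas, and so on; Detail stands in for that payload.
type Expr struct {
	Op       Operand
	Detail   string
	Children []*Expr
}

// Pattern describes only the shape of an expression tree: operator types
// and how they are nested, nothing else.
type Pattern struct {
	Op       Operand
	Children []*Pattern
}

// Match reports whether the expression rooted at e has the shape described
// by p. Children that the pattern does not mention are left unconstrained.
func (p *Pattern) Match(e *Expr) bool {
	if p.Op != OperandAny && p.Op != e.Op {
		return false
	}
	if len(p.Children) > len(e.Children) {
		return false
	}
	for i, cp := range p.Children {
		if !cp.Match(e.Children[i]) {
			return false
		}
	}
	return true
}

func main() {
	// Pattern: a Selection directly above a Join.
	selOverJoin := &Pattern{Op: OperandSelection, Children: []*Pattern{{Op: OperandJoin}}}

	scanT := &Expr{Op: OperandDataSource, Detail: "t"}
	scanS := &Expr{Op: OperandDataSource, Detail: "s"}
	join := &Expr{Op: OperandJoin, Detail: "t.a = s.a", Children: []*Expr{scanT, scanS}}
	sel := &Expr{Op: OperandSelection, Detail: "t.b > 1", Children: []*Expr{join}}

	fmt.Println(selOverJoin.Match(sel))  // true: Selection over Join
	fmt.Println(selOverJoin.Match(join)) // false: the root is a Join
}
```

Note that `Match` never looks at `Detail`: two expressions with different predicates but the same operator shape match the same pattern.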
But there might be some scenarios that certain push-down rule can also be
triggered after the second bottom-up traverse. In order to explore all
optimization possibilities, the traverse on the groups should not be stopped
until there is no rule can be matched:
How could we ensure that it is convergent?
We can add limitations, for example that a rule cannot be applied more than once to the same group.
At present, the optimization procedure of the planner is separated into two
phases. The first phase, namely the "Logical Optimization", only applies the
rules which always beneficial. The second phase, which is called the "Physical
rules which always beneficial -> rules which are always beneficial
or
rules which always beneficial -> the always-beneficial rules
subquery unfold, etc.

Another drawback of the current planner is the poor extensibility. It's hard to
add a new rule even if it's beneficial for all the scenarios: we have to
Why do you use a colon (:) here?
You can change it to "where", "because", or "so that" based on the text meaning.
Another drawback of the current planner is the poor extensibility. It's hard to
add a new rule even if it's beneficial for all the scenarios: we have to
consider the order of diffenent optimization rules carefully.
diffenent -> different
The physical optimization for the operators on the storage layer also suffers
from the poor extensibility. In the present planner, we use "root" and "cop"
task to distinguish the operators executed on TiDB and the storage layer, TiKV.
task -> tasks
from the poor extensibility. In the present planner, we use "root" and "cop"
task to distinguish the operators executed on TiDB and the storage layer, TiKV.
The way to seperate a "cop" task is also not extensible, and "cop" tasks are
highly tied with "root" task. For exmple, we can only has a `Stream` aggregate
exmple -> example
has -> have
The fourth step is to adopt the "Adaptor" conception to rewrite the operator
push-down logical for different storages.

The fifth step is to add some rules which are not able or not easy to be added
The fifth step is to add some rules which are not allowed or not easy to be added?
or
The fifth step is to add some rules which are not easy or cannot be added?
The implementation rule is used to implement a logical expression operator to
a physical operator. For example, with implementation rules, a logical `Join`
operator can be implementated to `HashJoin`/`MergeJoin`/`IndexJoin`, etc.
implementated -> implemented
- **Operand**

As disscussed above, the operand represents a logical expression operator. It
disscussed -> discussed
@CaitinChen Done, thanks for your patient review! PTAL again.
The physical optimization for the operators on the storage layer also suffers
from the poor extensibility. In the present planner, we use "root" and "cop"
tasks to distinguish the operators executed on TiDB and the storage layer, that
is TiKV at the present. "cop" task are highly tied with "root" task, it's very
is TiKV at present. The "cop" task is highly tied with the "root" task. It's very
at present = at the present time
from the poor extensibility. In the present planner, we use "root" and "cop"
tasks to distinguish the operators executed on TiDB and the storage layer, that
is TiKV at the present. "cop" task are highly tied with "root" task, it's very
hard to push-down another operator to TiKV or supporting another storage engine
supporting -> support
- **Transformation Rule**

The transform rule is used to transform a logical plan to another equivalent
transform -> transformation
} | ||
``` | ||

The child of `GroupExpr` is `Group`. There are many candicate child expressions
candicate -> candidate
} | ||
``` | ||

At the very beginning, there is only one group expression in a `Group`, after
At the very beginning, there is only one group expression in a `Group`. After
1. Adding a session variable named `tidb_enable_volcano_planner` to control
whether to use the new planner. Once this variable is set, all the
optimization steps are handed to the new planner. The procedure of
Please delete the extra space between "." and "The".
for different storages.

5. Adding some rules which are not easy or can not be added in the old planner
to improve the performance on certain scenarios.
to improve the performance in certain scenarios.
@zz-jason My pleasure~
can be expressed to a tree-like structure and the child of an expression is
also an expression.

- **Expression Group**(or **Group**)
- **Expression Group** (or **Group**)
I added a space before "(".
@tianjiqx You can open an issue to request this. It can be treated as a concrete optimization rule.
@CaitinChen done, PTAL again.
LGTM
LGTM
What problem does this PR solve?
Propose a SQL planner based on the Volcano/Cascades model
What is changed and how it works?
Check List
Tests