-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
expression: propagate more filters in PropagateConstant #7276
Conversation
/run-all-tests |
expression/constant_propagation.go
Outdated
ast.GE: true, | ||
ast.NE: true, | ||
// nonDeterministicFuncNameMap stores all the non-deterministic operators that cannot be propagated. | ||
var nonDeterministicFuncNameMap = map[string]bool{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can use unFoldableFunctions
in "expression/function_traits.go" to replace this map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK~
expression/constant_propagation.go
Outdated
|
||
condsLen := len(s.conditions) | ||
for j, colj := range s.columns { | ||
for k, colk := range s.columns { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about:
for j, colj := range s.columns {
for k := j+1; k < len(s.columns); k++ {
...
replaced, _, newExpr := s.tryToReplaceCond(colj, colk, cond)
replaced, _, newExpr := s.tryToReplaceCond(colk, colj, cond)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! thank you!
expression/constant_propagation.go
Outdated
func (s *propagateConstantSolver) tryToReplaceCond(src *Column, tgt *Column, cond Expression) (bool, bool, Expression) { | ||
replaced := false | ||
var r *ScalarFunction | ||
if funct, ok := cond.(*ScalarFunction); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/funct/sf/?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK~
@bb7133 Thanks! Please fix the CI. |
/run-unit-test |
1 similar comment
/run-unit-test |
expression/constant_propagation.go
Outdated
if _, ok := unFoldableFunctions[sf.FuncName.L]; ok { | ||
return false, true, cond | ||
} | ||
if _, isEq := eqFuncNameMap[sf.FuncName.L]; isEq { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just sf.FuncName.L == ast.EQ
is okay. And add comment why we skip it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we still keep eqFuncNameMap
? @winoros
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bb7133 Seems that it can be removed. validPropagateCond
now is only used to check eq cond. You can modify this method and remove the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! @winoros
expression/constant_propagation.go
Outdated
if r == nil { | ||
r = sf.Clone().(*ScalarFunction) | ||
} | ||
r.GetArgs()[idx] = tgt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd better call NewFunction
. Don't set scalar function
's arg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, it's better to treat scalar function immutable
expression/constant_propagation.go
Outdated
if r == nil { | ||
r = sf.Clone().(*ScalarFunction) | ||
} | ||
r.GetArgs()[idx] = subExpr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
115b0b0
to
e5a08a7
Compare
@bb7133
You can run |
e5a08a7
to
49a99ea
Compare
Fixed, PTAL @winoros, thanks! |
/run-all-tests |
LGTM |
expression/constant_propagation.go
Outdated
if eq, ok := cond.(*ScalarFunction); ok { | ||
if _, ok := funNameMap[eq.FuncName.L]; !ok { | ||
if eq.FuncName.L != ast.EQ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- update the comment of this function since we do not consider non-equal ops in this function.
- s/ validPropagateCond/ validEqualCond?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right
expression/constant_propagation.go
Outdated
// bool: if 'cond' contains non-deterministic expression | ||
// Expression: the replaced expression, or original 'cond' if the replacement is not happened | ||
func (s *propagateConstantSolver) tryToReplaceCond(src *Column, tgt *Column, cond Expression) (bool, bool, Expression) { | ||
if sf, ok := cond.(*ScalarFunction); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sf, ok := cond.(*ScalarFunction); !ok {
return false, false, cond
}
......
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
expression/constant_propagation.go
Outdated
if _, ok := unFoldableFunctions[sf.FuncName.L]; ok { | ||
return false, true, cond | ||
} | ||
// Equality is handled in propagateEQ already |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- propagateEQ only handles equal functions like
column = constant
but not handle the case like
a = b and b = funcname()
, should we consider it here? - Or may be we do not need propagateEQ now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many optimizations we can consider in constant propagation and now I think it may be a good idea to have a doc so we can:
- Make further discussions and details.
- Make a list of subtasks.
I will try to give this initial doc soon. Your comment is appreciated, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code to support b = func()
case, thanks
expression/constant_propagation.go
Outdated
continue | ||
} | ||
colk := s.columns[k] | ||
for i := 0; i < condsLen; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ condsLen/ len(s.conditions)
and remove line 107
since we only use it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to keep condsLen
because we're adding elements to conditions
while iterating it. Iteration on new elements are not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to comment for the behavior of the 3-layer loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed, thanks
expression/constant_propagation.go
Outdated
// tryToReplaceCond aims to replace all occurances of column 'src' and try to replace it with 'tgt' in 'cond' | ||
// It returns | ||
// bool: if a replacement happened | ||
// bool: if 'cond' contains non-deterministic expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we can remove this return value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This boolean value is used to avoid some bad case like: a = b and cast(b, int) < rand()
, for which we cannot propagate a cast(a, int) < rand()
.
There may be more sophisticated solution, do you have any advise? thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, never mind
expression/constant_propagation.go
Outdated
@@ -158,6 +146,51 @@ func (s *propagateConstantSolver) validPropagateCond(cond Expression, funNameMap | |||
return nil, nil | |||
} | |||
|
|||
// tryToReplaceCond aims to replace all occurances of column 'src' and try to replace it with 'tgt' in 'cond' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a more detailed comment here,
an example may help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah,I will try to state more details
/unit-test |
@bb7133 any update? |
code updated, sorry it costed a long time @zz-jason |
5dcb6eb
to
d7b470a
Compare
1. add propagation for EQ condition like 'a eq func' 2. refined some method/variable names 3. added more comments 4. added 1 more unit test Signed-off-by: bb7133 <bb7133@gmail.com>
d7b470a
to
fd15734
Compare
/run-all-tests |
// e.g. For expression a = b and b = c and c = d and c < 1 , we can get extra a < 1 and b < 1 and d < 1. | ||
// However, for a = b and a < rand(), we cannot propagate a < rand() to b < rand() because rand() is non-deterministic | ||
// | ||
// This propagation may bring redundancies that we need to resolve later, for example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the present planner, we can do this by adding a post-process after the logical optimization, before deriving the statistics and build index/table ranges.
In the new cascades planner, we can do this by adding a rule which only works on a Filter
operator and remove the duplicated filters.
@bb7133 What's your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current ranger can spot these redundant filters and merge them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eurekaka maybe we can do the enhancement in ranger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like ranger would only simplify and merge these filters when they are considered to be index filters, if there is no index and no int primary key, these filters would be in PhysicalSelection as what they are originally; maybe we should extract the range simplification logic out of ranger, and this should be like the 'post-process' you mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zz-jason Sounds interesting! And looks @tiancaiamao is trying to solve (at least part of) the redundant filters in his PR, which we can expect.
// for a = b and a < 3 and b < 4, we get new a < 4 and b < 3 but should expect a < 3 and b < 3 | ||
// for a = b and a in (3) and b in (4), we get b in (3) and a in (4) but should expect 'false' | ||
// | ||
// TODO: remove redundancies later |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to add a github issue for this TODO item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
@tiancaiamao What's the idea of the "aggressive constant propagate"? |
Current constant propagate doesn't infer much on InEQ condition: "a > 3" && "a > 5" => "a > 5" And it doesn't handle function: to_days(t) > date '2018-03-05' && to_days(t) < '2018-03-01' => false I'm writing an "aggressive" one to solve this issue #7516 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM |
/run-all-tests |
Well, @tiancaiamao your idea is mentioned in my TODO list in the comment, it would be great if you can solve it. BTW, I wrote a doc about my opinions on related optimizations: https://docs.google.com/document/d/1G3wVBaiza9GI5q9nwHLCB2I7r1HUbg6RpTNFpQWTFhQ/edit?usp=sharing |
What have you changed? (mandatory)
This PR try to propagate more filters/constraints in PropagateConstant to archive the optimization mentioned in #7098. As long as an expression is deterministic, it can be propagated through the equality condition.
For example(the case in #7098), before this PR:
After this PR, we have:
However, propagation should be avoid for conditions with non-deterministic expression like
rand()
. AnonDeterministicFuncNameMap
is added to make sure those expressions are excluded.What is the type of the changes? (mandatory)
How has this PR been tested? (mandatory)
Does this PR affect documentation (docs/docs-cn) update? (mandatory)
No
Does this PR affect tidb-ansible update? (mandatory)
No
Does this PR need to be added to the release notes? (mandatory)
No
Refer to a related PR or issue link (optional)
#7098
Add a few positive/negative examples (optional)
Since we try to do propagations as much as possible, some propagated filters may be un-necessary, or at least can be folded/combined, for example: