ranger: merge multiple EQ or In expressions if possible #7577
This PR should not be merged before #7553 is merged, because for queries like:
select /*+ TIDB_INLJ(t1) */ * from t t1 join t t2 where t1.a=t2.a and t2.a = 2
a fake EqCond like t2.a = 1 would be introduced when building IndexLookUpJoin. After this PR, this fake EqCond would be merged with t2.a = 2 to build an empty range, which is wrong. #7553 fixes this problem, so this PR depends on it.
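To illustrate why a fake equality condition becomes harmful once conditions are merged, here is a minimal Go sketch of intersecting the value lists of two EQ/IN conditions on the same column. The helper `intersectValues` and the use of plain `int64` values are assumptions made for illustration; this is not the ranger code from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// intersectValues returns the sorted values present in both a and b.
// Each slice models the constant list of one EQ (one value) or IN
// (several values) condition on the same column.
// Hypothetical helper for illustration, not TiDB's ranger API.
func intersectValues(a, b []int64) []int64 {
	seen := make(map[int64]struct{}, len(a))
	for _, v := range a {
		seen[v] = struct{}{}
	}
	out := make([]int64, 0)
	for _, v := range b {
		if _, ok := seen[v]; ok {
			out = append(out, v)
			delete(seen, v) // do not emit the same value twice
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// a IN (1, 2, 3) AND a IN (2, 3, 4) keeps only 2 and 3.
	fmt.Println(intersectValues([]int64{1, 2, 3}, []int64{2, 3, 4}))
	// a = 1 AND a = 2 share no value, so the merged range is empty.
	fmt.Println(len(intersectValues([]int64{1}, []int64{2})))
}
```

With a real condition `t2.a = 2` and a placeholder condition `t2.a = 1`, the intersection is empty, which matches the wrong empty range described above.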
Should the datasource in the following query be a TableDual? Because
@shenli makes sense, patch updated. Two problems were found while converting an empty-range DataSource into TableDual.
First, the plan cache would be broken if an empty-range DataSource is converted to TableDual; check out this example with
MySQL [test]> create table t(a int, b int, index a_idx(a));
Query OK, 0 rows affected (0.01 sec)
MySQL [test]> insert into t values(1,1),(2,2),(null,3);
Query OK, 3 rows affected (0.00 sec)
MySQL [test]> select * from t;
+------+------+
| a | b |
+------+------+
| 1 | 1 |
| 2 | 2 |
| NULL | 3 |
+------+------+
3 rows in set (0.00 sec)
MySQL [test]> prepare stmt from 'select * from t where ?';
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> execute stmt using @param;
Empty set (0.00 sec)
MySQL [test]> set @param = true;
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> execute stmt using @param;
Empty set (0.00 sec)
The last execute returns an empty set even though @param is true and all three rows should match. I would add an issue to track this problem separately; in this patch, we do not apply this conversion when building a plan for a prepared statement with plan cache enabled.
Second, ranger has a bug regarding
MySQL [test]> select * from t;
+------+------+
| a | b |
+------+------+
| 1 | 1 |
| 2 | 2 |
| NULL | 3 |
+------+------+
3 rows in set (0.00 sec)
MySQL [test]> explain select * from t where a = null;
+-------------------+-------+------+--------------------------------------------------+
| id | count | task | operator info |
+-------------------+-------+------+--------------------------------------------------+
| IndexLookUp_10 | 0.00 | root | |
| ├─IndexScan_8 | 0.00 | cop | table:t, index:a, keep order:false, stats:pseudo |
| └─TableScan_9 | 0.00 | cop | table:t, keep order:false, stats:pseudo |
+-------------------+-------+------+--------------------------------------------------+
3 rows in set (0.00 sec)
MySQL [test]> select * from t where a = null;
Empty set (0.00 sec)
currently unit test
OK, for the second problem mentioned above, I just figured out
create table t(a bigint primary key);
explain select * from t where a = 1 and a = 2;
id count task operator info
TableDual_5 10000.00 root rows:0
The count field of the explain output seems wrong; it should be 0.00 instead. It is computed by the following code snippet in function GetRowCountByIntColumnRanges, called by Selectivity:
404 if coll.ColumnIsInvalid(sc, colID) {
405 if len(intRanges) == 0 {
406 return float64(coll.Count), nil
407 }
408 if intRanges[0].LowVal[0].Kind() == types.KindInt64 {
409 return getPseudoRowCountBySignedIntRanges(intRanges, float64(coll.Count)), nil
410 }
411 return getPseudoRowCountByUnsignedIntRanges(intRanges, float64(coll.Count)), nil
412 }
Line 406 does not make sense; it should return 0, nil when the range is empty. Anyway, this should be resolved in another issue, and we can update this result later.
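The suggested behavior can be sketched in Go as follows. This is only an illustration under stated assumptions: `intRange`, `pseudoRowCount`, and the fixed per-range fraction are hypothetical stand-ins, not the statistics code quoted above. The only point it demonstrates is that an empty range list should estimate zero rows rather than the full table count.

```go
package main

import "fmt"

// intRange is a hypothetical closed interval [Low, High];
// it is not TiDB's ranger.Range type.
type intRange struct {
	Low, High int64
}

// pseudoRowCount estimates the row count when no usable statistics
// exist. An empty range list matches no rows, so it returns 0 instead
// of the total row count.
func pseudoRowCount(ranges []intRange, total float64) float64 {
	if len(ranges) == 0 {
		return 0
	}
	// Placeholder estimate (an arbitrary assumption for this sketch):
	// each range selects 1/1000 of the table, capped at the total.
	est := float64(len(ranges)) * total / 1000
	if est > total {
		est = total
	}
	return est
}

func main() {
	fmt.Println(pseudoRowCount(nil, 10000))                  // empty range
	fmt.Println(pseudoRowCount([]intRange{{1, 1}}, 10000)) // one point range
}
```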
check the len(values) to decide whether EQ or In
be converted to TableDual now
…converted to TableDual
f092469 to 4e8f961
util/ranger/ranger.go
Outdated
sf, _ := expr.(*expression.ScalarFunction)
//Constant and Column args should have same RetType, simply get from first arg
retType := sf.GetArgs()[0].GetType()
values := make([]expression.Expression, 0, len(points)/2)
how about s/values/args/?
util/ranger/ranger.go
Outdated
@@ -466,3 +467,33 @@ func newFieldType(tp *types.FieldType) *types.FieldType {
return tp
}
}

func points2EqOrInCond(ctx sessionctx.Context, points []point, expr expression.Expression) expression.Expression {
we need a comment for this function to explain:
- the basic functionality.
- the constraints of the input parameters.
how about:
// points2EqOrInCond constructs an 'EQUAL' or 'IN' scalar function based on the
// 'points'. The target column is extracted from the 'expr'.
// NOTE:
// 1. 'expr' must be either 'EQUAL' or 'IN' function.
// 2. 'points' should not be empty.
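The contract described in that comment can be illustrated with a simplified Go sketch. It assumes the merged points have already been reduced to a list of distinct values, and it renders the condition as a string instead of building an expression tree; `buildEqOrIn` is a hypothetical name, not the function under review.

```go
package main

import (
	"fmt"
	"strings"
)

// buildEqOrIn renders a condition on col from the merged values:
// one value yields an equality, several values yield an IN list.
// An empty list is rejected, mirroring the "should not be empty"
// constraint. Hypothetical helper for illustration only.
func buildEqOrIn(col string, values []int64) (string, error) {
	switch len(values) {
	case 0:
		return "", fmt.Errorf("no values for column %s", col)
	case 1:
		return fmt.Sprintf("%s = %d", col, values[0]), nil
	default:
		parts := make([]string, len(values))
		for i, v := range values {
			parts[i] = fmt.Sprintf("%d", v)
		}
		return fmt.Sprintf("%s IN (%s)", col, strings.Join(parts, ", ")), nil
	}
}

func main() {
	eq, _ := buildEqOrIn("a", []int64{2})
	in, _ := buildEqOrIn("a", []int64{2, 3})
	fmt.Println(eq)
	fmt.Println(in)
}
```

Returning an error for the empty case is one possible design; the caller could equally terminate early before calling the helper, which is what the constraint in the comment implies.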
@@ -252,6 +252,14 @@ func (ds *DataSource) findBestTask(prop *requiredProp) (t task, err error) {
t = invalidTask

for _, path := range ds.possibleAccessPaths {
Maybe it's better to remove other access paths if there is a path which has an empty range. This can be done in DataSource.deriveStats()
updated
/run-all-tests
LGTM
@winoros PTAL
util/ranger/detacher.go
Outdated
accesses[offset] = cond
continue
}
//multiple Eq/In conditions for one column in CNF, apply intersection on them
Add a space between // and the sentence, and capitalize the first character.
util/ranger/detacher.go
Outdated
continue
}
//multiple Eq/In conditions for one column in CNF, apply intersection on them
//lazily compute the points for the previously visited Eq/In
Add space.
util/ranger/ranger.go
Outdated
// 1. 'expr' must be either 'EQUAL' or 'IN' function.
// 2. 'points' should not be empty.
func points2EqOrInCond(ctx sessionctx.Context, points []point, expr expression.Expression) expression.Expression {
//len(points) cannot be 0 here, since we impose early termination in extractEqAndInCondition
Add space.
util/ranger/ranger.go
Outdated
func points2EqOrInCond(ctx sessionctx.Context, points []point, expr expression.Expression) expression.Expression {
//len(points) cannot be 0 here, since we impose early termination in extractEqAndInCondition
sf, _ := expr.(*expression.ScalarFunction)
//Constant and Column args should have same RetType, simply get from first arg
Add space.
We also need some tests in ranger/ranger_test.go.
Rest LGTM.
/run-all-tests
@winoros comment addressed, PTAL
/run-all-tests
/run-common-test tidb-test=pr/620
/run-common-test
/run-integration-common-test
/run-all-tests
What problem does this PR solve?
handle more than one 'equal' or 'in' function for a column in ranger #7279
What is changed and how it works?
before this pr:
The filters can in fact be simplified to eliminate unnecessary 'Selections', or even removed entirely to build an empty range.
after this pr:
Check List
Tests
Related changes