optimizer lacks feature of OR expansion and predicate elimination #56005
Comments
To process the above query, the optimizer first needs to do an OR expansion:
explain SELECT * FROM
( SELECT
dt.*
FROM
it
LEFT JOIN dt ON it.pk = dt.pk
WHERE
it.a = "a" AND it.a > "a"
UNION
SELECT
dt.*
FROM
it
LEFT JOIN dt ON it.pk = dt.pk
WHERE it.a = "a"
AND it.a = "a"
AND it.pk > 1
) tb
ORDER BY
tb.pk
LIMIT
240;
Then the optimizer should eliminate the first branch of the OR expansion, because its predicate it.a = "a" AND it.a > "a" can never be true:
explain SELECT * FROM
(
SELECT
dt.*
FROM
it
LEFT JOIN dt ON it.pk = dt.pk
WHERE it.a = "a"
AND it.a = "a"
AND it.pk > 1
) tb
ORDER BY
tb.pk
LIMIT
240;
+-------------------------------+---------+-----------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TopN_14 | 41.67 | root | | test.dt.pk, offset:0, count:240 |
| └─IndexJoin_27 | 41.67 | root | | left outer join, inner:TableReader_23, outer key:test.it.pk, inner key:test.dt.pk, equal cond:eq(test.it.pk, test.dt.pk) |
| ├─IndexReader_39(Build) | 33.33 | root | | index:IndexRangeScan_38 |
| │ └─IndexRangeScan_38 | 33.33 | cop[tikv] | table:it, index:f(a, pk) | range:("a" 1,"a" +inf], keep order:false, stats:pseudo |
| └─TableReader_23(Probe) | 11.11 | root | | data:Selection_22 |
| └─Selection_22 | 11.11 | cop[tikv] | | gt(test.dt.pk, 1) |
| └─TableRangeScan_21 | 33.33 | cop[tikv] | table:dt | range: decided by [test.it.pk], keep order:false, stats:pseudo |
+-------------------------------+---------+-----------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
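The elimination step described above can be sketched as a single-column interval intersection. This is only an illustration of the reasoning, not TiDB's implementation: the `Interval` class and the `intersect` helper are hypothetical names, and Python's code-point string ordering stands in for a binary comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """A range over one column; a bound of None means unbounded (hypothetical helper)."""
    low: Optional[str] = None
    high: Optional[str] = None
    low_open: bool = False
    high_open: bool = False

    def is_empty(self) -> bool:
        if self.low is None or self.high is None:
            return False
        if self.low != self.high:
            return self.low > self.high
        # Same endpoint on both sides: empty unless both ends are closed.
        return self.low_open or self.high_open


def _tighter(v1, open1, v2, open2, want_max):
    """Pick the more restrictive of two bounds."""
    if v1 is None:
        return v2, open2
    if v2 is None:
        return v1, open1
    if v1 == v2:
        return v1, open1 or open2
    first_wins = (v1 > v2) if want_max else (v1 < v2)
    return (v1, open1) if first_wins else (v2, open2)


def intersect(x: Interval, y: Interval) -> Interval:
    """AND of two predicates on the same column."""
    low, low_open = _tighter(x.low, x.low_open, y.low, y.low_open, want_max=True)
    high, high_open = _tighter(x.high, x.high_open, y.high, y.high_open, want_max=False)
    return Interval(low, high, low_open, high_open)


eq_a = Interval(low="a", high="a")       # it.a = "a"
gt_a = Interval(low="a", low_open=True)  # it.a > "a"

branch1 = intersect(eq_a, gt_a)  # it.a = "a" AND it.a > "a"
branch2 = intersect(eq_a, eq_a)  # it.a = "a" AND it.a = "a"

surviving = [name for name, rng in (("branch 1", branch1), ("branch 2", branch2))
             if not rng.is_empty()]
print(surviving)
```

Only the second branch survives, and its remaining condition it.pk > 1 is what lets the range on f(a, pk) start after ("a", 1) instead of covering every pk under "a".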
We have separate logic for predicate simplification and index range derivation. I think this could be an enhancement to either one, and we need to see which is the best way to do it.
The current logic actually does perform such simplification, but the issue is the type/collation mismatch between it.a (binary) and the constant "a", which is utf8mb4_0900_ai_ci. A workaround is to replace the constants with binary constants like b'......'.
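One way to see why a type/collation mismatch can matter to a simplifier: the same comparison selects different values under a binary ordering than under an accent- and case-insensitive one. The sketch below is a rough approximation only. `ai_ci_key` is a hypothetical stand-in that strips accents and case; it is not the real utf8mb4_0900_ai_ci algorithm, and nothing here reflects TiDB's actual code.

```python
import unicodedata


def ai_ci_key(s: str) -> str:
    """Approximate accent- and case-insensitive sort key (assumption, not the real collation)."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def binary_key(s: str) -> bytes:
    """Binary comparison: compare the raw UTF-8 bytes."""
    return s.encode("utf-8")


values = ["a", "A", "\u00e1", "B"]  # "\u00e1" is a-acute

eq_binary = [v for v in values if binary_key(v) == binary_key("a")]
eq_ai_ci = [v for v in values if ai_ci_key(v) == ai_ci_key("a")]
gt_binary = [v for v in values if binary_key(v) > binary_key("a")]
gt_ai_ci = [v for v in values if ai_ci_key(v) > ai_ci_key("a")]

print("=  binary:", eq_binary, " ai_ci:", eq_ai_ci)
print(">  binary:", gt_binary, " ai_ci:", gt_ai_ci)
```

Because the two orderings disagree on both equality and greater-than, a rewrite that merges or cancels predicates has to know which comparison applies, which is presumably why making the constants binary sidesteps the problem.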
Yes, it works:
mysql> explain SELECT
-> dt.*
-> FROM
-> it
-> LEFT JOIN dt ON it.pk = dt.pk
-> WHERE
-> it.a = 0xaa
-> AND (
-> (
-> it.a > 0xaa
-> )
-> OR (
-> it.a = 0xaa AND it.pk > 1
-> )
-> )
-> ORDER BY
-> it.pk
-> LIMIT
-> 240;
+-------------------------------------+---------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------------+---------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_12 | 41.67 | root | | test.dt.a, test.dt.pk, test.dt.b, test.dt.c |
| └─Limit_19 | 41.67 | root | | offset:0, count:240 |
| └─IndexHashJoin_71 | 41.67 | root | | left outer join, inner:TableReader_66, outer key:test.it.pk, inner key:test.dt.pk, equal cond:eq(test.it.pk, test.dt.pk), other cond:or(gt(test.it.a, "0xaa"), and(eq(test.it.a, "0xaa"), gt(test.dt.pk, 1))) |
| ├─Limit_36(Build) | 33.33 | root | | offset:0, count:240 |
| │ └─IndexReader_46 | 33.33 | root | | index:Limit_45 |
| │ └─Limit_45 | 33.33 | cop[tikv] | | offset:0, count:240 |
| │ └─IndexRangeScan_44 | 33.33 | cop[tikv] | table:it, index:f(a, pk) | range:("\xaa" 1,"\xaa" +inf], keep order:true, stats:pseudo |
| └─TableReader_66(Probe) | 33.33 | root | | data:TableRangeScan_65 |
| └─TableRangeScan_65 | 33.33 | cop[tikv] | table:dt | range: decided by [test.it.pk], keep order:false, stats:pseudo |
+-------------------------------------+---------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)
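To make the difference between a prefix-only range such as ["a","a"] and the composite range shown in the plan above concrete, here is a small sketch that models the index f(a, pk) as a sorted list of tuples. The index contents and the helper names `scan_prefix` and `scan_composite` are made up for illustration; this is not how TiKV stores or scans index data.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")

# Hypothetical entries of KEY f(a, pk), sorted the way composite index keys are.
index_f = sorted((a, pk) for a in (b"\xa9", b"\xaa", b"\xab") for pk in range(10))


def scan_prefix(entries, a):
    """Entries read for a range like [a, a]: every pk under the prefix."""
    start = bisect_left(entries, (a,))
    end = bisect_right(entries, (a, INF))
    return entries[start:end]


def scan_composite(entries, a, pk_low):
    """Entries read for a range like (a pk_low, a +inf]: only pk > pk_low."""
    start = bisect_right(entries, (a, pk_low))
    end = bisect_right(entries, (a, INF))
    return entries[start:end]


wide = scan_prefix(index_f, b"\xaa")
narrow = scan_composite(index_f, b"\xaa", 1)

# The wide scan needs an extra filter to reach the same result.
filtered = [entry for entry in wide if entry[1] > 1]

print(len(wide), len(narrow), filtered == narrow)
```

Both scans produce the same rows once the wide one is filtered, but the wide one reads every entry under the prefix. With pk > 1 the gap is small; it grows with the lower bound on pk, which is where the extra data access becomes costly.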
Bug Report
1. Minimal reproduce step (Required)
2. What did you expect to see? (Required)
There is an index KEY f(a, pk), and there is a WHERE predicate it.a = "a" AND it.pk > 1. We expect to see range:("a" 1,"a" +inf] in the execution plan.
3. What did you see instead (Required)
This is the full execution plan
We see range:["a","a"]. This is causing unnecessary additional data access and poor performance.
4. What is your TiDB version? (Required)