Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner: the optimizer cannot convert OUTER JOIN to INNER JOIN with nested AND/OR predicates in some cases #49616

Closed
qw4990 opened this issue Dec 20, 2023 · 7 comments · Fixed by #49625
Labels
affects-6.5 affects-7.1 affects-7.5 feature/reviewing This feature request is reviewing by product managers needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. report/customer Customers have encountered this bug. sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement. type/performance type/regression

Comments

@qw4990
Copy link
Contributor

qw4990 commented Dec 20, 2023

Enhancement

create table t1 (k int, a int);
create table t2 (k int, b int, key(k));

explain select /*+ tidb_inlj(t2, t1) */ *
  from t2 left join t1 
    on t1.k=t2.k
  where 
    a>0 or (a=0 and b>0);

a>=0

+--------------------------------+----------+-----------+---------------+---------------------------------------------------------------+
| id                             | estRows  | task      | access object | operator info                                                 |
+--------------------------------+----------+-----------+---------------+---------------------------------------------------------------+
| Selection_7                    | 9990.00  | root      |               | or(gt(test.t1.a, 0), and(eq(test.t1.a, 0), gt(test.t2.b, 0))) |
| └─HashJoin_8                   | 12487.50 | root      |               | left outer join, equal:[eq(test.t2.k, test.t1.k)]             |
|   ├─TableReader_14(Build)      | 9990.00  | root      |               | data:Selection_13                                             |
|   │ └─Selection_13             | 9990.00  | cop[tikv] |               | not(isnull(test.t1.k))                                        |
|   │   └─TableFullScan_12       | 10000.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                                |
|   └─TableReader_11(Probe)      | 10000.00 | root      |               | data:TableFullScan_10                                         |
|     └─TableFullScan_10         | 10000.00 | cop[tikv] | table:t2      | keep order:false, stats:pseudo                                |
+--------------------------------+----------+-----------+---------------+---------------------------------------------------------------+

In the case above, the tidb_inlj hint doesn't take effect, because the optimizer cannot convert this OUTER JOIN into an INNER JOIN, since the predicate a>0 or (a=0 and b>0) doesn't pass the null-rejective check (see isNullRejected, evaluateExprWithNullInNullRejectCheck).
But actually, the predicate is null-rejective since no matter the value of b, if a is null, then the eventual value of this predicate is null, and so all unmatched OUTER JOIN rows (whose a is null) will be eliminated by this predicate. So it's safe to convert this OUTER JOIN to an INNER JOIN, but the optimizer didn't recognize this.

This issue is caused by #38430, our current null-rejective check is too strict.

@qw4990 qw4990 added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner feature/reviewing This feature request is reviewing by product managers labels Dec 20, 2023
@qw4990
Copy link
Contributor Author

qw4990 commented Dec 20, 2023

// specialNullRejectedCase1 is mainly for #49616.
// Case1 specially handles `null-rejected OR (null-rejected AND {others})`, then no matter what the result
// of `{others}` is (True, False or Null), the result of this predicate is null, so this predicate is null-rejected.

ti-chi-bot bot pushed a commit that referenced this issue Dec 20, 2023
…o INNER JOIN with nested AND/OR in some cases (#49625)

close #49616
@qw4990 qw4990 added needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. labels Dec 21, 2023
ti-chi-bot bot pushed a commit that referenced this issue Dec 21, 2023
…o INNER JOIN with nested AND/OR in some cases (#49648)

close #49616
ti-chi-bot bot pushed a commit that referenced this issue Dec 22, 2023
…o INNER JOIN with nested AND/OR in some cases (#49625) (#49638)

close #49616
@kennedy8312
Copy link

/type regression

@kennedy8312
Copy link

Regression Analysis
PR caused this regression: #38430

@kennedy8312
Copy link

mysql> explain select /*+ tidb_inlj(t2, t1) */ *
-> from t2 left join t1
-> on t1.k=t2.k
-> where
-> a>0 or (a=0 and b>0);
+--------------------------------+----------+-----------+---------------+---------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------------+----------+-----------+---------------+---------------------------------------------------------------+
| Selection_7 | 9990.00 | root | | or(gt(test.t1.a, 0), and(eq(test.t1.a, 0), gt(test.t2.b, 0))) |
| └─HashJoin_8 | 12487.50 | root | | left outer join, equal:[eq(test.t2.k, test.t1.k)] |
| ├─TableReader_14(Build) | 9990.00 | root | | data:Selection_13 |
| │ └─Selection_13 | 9990.00 | cop[tikv] | | not(isnull(test.t1.k)) |
| │ └─TableFullScan_12 | 10000.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
| └─TableReader_11(Probe) | 10000.00 | root | | data:TableFullScan_10 |
| └─TableFullScan_10 | 10000.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
+--------------------------------+----------+-----------+---------------+---------------------------------------------------------------+
7 rows in set, 1 warning (0.01 sec)

mysql> select tidb_version();
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version() |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v6.4.0
Edition: Community
Git Commit Hash: cf36a9c
Git Branch: heads/refs/tags/v6.4.0
UTC Build Time: 2022-11-13 05:15:45
GoVersion: go1.19
Race Enabled: false
TiKV Min Version: 6.2.0-alpha
Check Table Before Drop: false
Store: tikv |

@kennedy8312
Copy link

mysql> explain select /*+ tidb_inlj(t2, t1) */ *
-> from t2 left join t1
-> on t1.k=t2.k
-> where
-> a>0 or (a=0 and b>0);
+---------------------------------+----------+-----------+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+---------------------------------+----------+-----------+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_11 | 4174.99 | root | | inner join, inner:IndexLookUp_10, outer key:test.t1.k, inner key:test.t2.k, equal cond:eq(test.t1.k, test.t2.k), other cond:or(gt(test.t1.a, 0), and(eq(test.t1.a, 0), gt(test.t2.b, 0))) |
| ├─TableReader_23(Build) | 3339.99 | root | | data:Selection_22 |
| │ └─Selection_22 | 3339.99 | cop[tikv] | | not(isnull(test.t1.k)), or(gt(test.t1.a, 0), eq(test.t1.a, 0)) |
| │ └─TableFullScan_21 | 10000.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
| └─IndexLookUp_10(Probe) | 1.25 | root | | |
| ├─Selection_9(Build) | 1.25 | cop[tikv] | | not(isnull(test.t2.k)) |
| │ └─IndexRangeScan_7 | 1.25 | cop[tikv] | table:t2, index:k(k) | range: decided by [eq(test.t2.k, test.t1.k)], keep order:false, stats:pseudo |
| └─TableRowIDScan_8(Probe) | 1.25 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
+---------------------------------+----------+-----------+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.00 sec)

mysql> select tidb_version();
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version() |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v6.3.0
Edition: Community
Git Commit Hash: ecd6753
Git Branch: heads/refs/tags/v6.3.0
UTC Build Time: 2022-09-23 14:21:08
GoVersion: go1.19.1
Race Enabled: false
TiKV Min Version: 6.2.0-alpha
Check Table Before Drop: false
Store: tikv |

@kennedy8312
Copy link

/type performance

ti-chi-bot bot pushed a commit that referenced this issue Feb 18, 2024
…o INNER JOIN with nested AND/OR in some cases (#49625) (#49640)

close #49616
ti-chi-bot bot pushed a commit that referenced this issue Feb 27, 2024
…o INNER JOIN with nested AND/OR in some cases (#49625) (#49639)

close #49616
@seiya-annie
Copy link

/found customer

@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.5 affects-7.1 affects-7.5 feature/reviewing This feature request is reviewing by product managers needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. report/customer Customers have encountered this bug. sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement. type/performance type/regression
Projects
None yet
4 participants