planner: address collation ambiguity in scalar function construction during predicate simplification. (#57049) #57476
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## release-8.5 #57476 +/- ##
================================================
Coverage ? 56.9861%
================================================
Files ? 1770
Lines ? 625943
Branches ? 0
================================================
Hits ? 356701
Misses ? 245134
Partials ? 24108
+1
[APPROVALNOTIFIER] This PR is APPROVED.
This pull request has been approved by: AilinKid, fixdb
This is an automated cherry-pick of #57049
What problem does this PR solve?
Issue Number: close #56479
Problem Summary:
What changed and how does it work?
The failing SQL can be reduced to the following:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  c1 VARCHAR(175) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'asMF'
);
SELECT *
FROM t1
WHERE c1 BETWEEN 'string1' AND 'string2' AND (c1 = 'string3' OR IsNull(c1));
When the BETWEEN ... AND clause is rewritten, the collations of string1 and string2 are set to c1's collation (utf8mb4_unicode_ci) rather than to collation_connection.
During predicate simplification, a scalar function 'ge' is constructed with string1 and string3 as arguments. However, string1 has the collation utf8mb4_unicode_ci while string3 has utf8mb4_general_ci (collation_connection), and both have a coercibility of 4, so it is ambiguous which collation to use. Constructing the new scalar function therefore fails, which leads to a panic.
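The ambiguity can be illustrated with a small toy model. This is a sketch only, not TiDB's code: `Expr`, `AmbiguousCollation`, and `derive_collation` are hypothetical names, and the rule is simplified to "the lower coercibility wins; equal coercibility with different collations is an error". The coercibility values follow MySQL's convention (2 for a column, 4 for a string literal).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for a planner expression."""
    name: str
    collation: str
    coercibility: int  # MySQL convention: 2 = column, 4 = string literal


class AmbiguousCollation(Exception):
    """Raised when neither argument's collation takes precedence."""


def derive_collation(a: Expr, b: Expr) -> str:
    """Simplified rule: the argument with the lower coercibility wins."""
    if a.coercibility < b.coercibility:
        return a.collation
    if b.coercibility < a.coercibility:
        return b.collation
    if a.collation == b.collation:
        return a.collation
    raise AmbiguousCollation(f"{a.name} ({a.collation}) vs {b.name} ({b.collation})")


c1 = Expr("c1", "utf8mb4_unicode_ci", 2)
# After the BETWEEN rewrite, string1 carries c1's collation.
string1 = Expr("'string1'", "utf8mb4_unicode_ci", 4)
# string3 still carries collation_connection.
string3 = Expr("'string3'", "utf8mb4_general_ci", 4)

# Column vs. literal is unambiguous: the column's collation is used.
print(derive_collation(c1, string3))

# Literal vs. literal with different collations cannot be resolved.
try:
    derive_collation(string1, string3)
except AmbiguousCollation as err:
    print("cannot build ge:", err)
```

In this model, comparing the column with a literal always resolves cleanly. Only the constant-to-constant comparison introduced by predicate simplification hits the error path.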
There's an additional concern: if "BETWEEN ... AND" is replaced with ">= and <=", then both string1 and string3 have the collation utf8mb4_general_ci (collation_connection), so during predicate simplification string1 and string3 would be compared using utf8mb4_general_ci. This could produce incorrect results, because comparisons against the column c1 itself use utf8mb4_unicode_ci.
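To see why folding constants under the wrong collation matters, consider a toy comparison. This is a sketch under stated assumptions: `key_general_ci` and `key_unicode_ci` are hypothetical, heavily simplified stand-ins for the real collation weight tables. They model only one difference documented for MySQL, namely that ß compares equal to s under utf8mb4_general_ci and to ss under utf8mb4_unicode_ci.

```python
def key_general_ci(s: str) -> str:
    """Toy sort key: case-insensitive, and 'ß' sorts like 's'."""
    return s.lower().replace("ß", "s")


def key_unicode_ci(s: str) -> str:
    """Toy sort key: case-insensitive, and 'ß' sorts like 'ss'."""
    return s.lower().replace("ß", "ss")


def ge(a: str, b: str, key) -> bool:
    """Evaluate a >= b under the collation modeled by `key`."""
    return key(a) >= key(b)


# Predicate shape: c1 >= 'sa' AND c1 = 'ß'.
# Simplification folds this into the constant test 'ß' >= 'sa'.
lower_bound, point = "sa", "ß"

print(ge(point, lower_bound, key_general_ci))  # False: 's' sorts before 'sa'
print(ge(point, lower_bound, key_unicode_ci))  # True: 'ss' sorts after 'sa'
```

Under the general_ci model the folded predicate is always false, so the planner could discard rows that the column's own unicode_ci comparison would keep.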
A possible fix:
During predicate simplification, reset the collation of the constants (string1, string2, string3) to match the collation of the column c1. This addresses both problems.
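The proposed approach can be sketched as follows. This is a minimal illustration, not the actual patch: `Const` and `align_constants` are hypothetical names, and the real planner works on its own expression types.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Const:
    """Hypothetical stand-in for a string constant in a predicate."""
    value: str
    collation: str


def align_constants(column_collation: str, constants: list) -> list:
    """Return copies of the constants that carry the column's collation."""
    return [replace(c, collation=column_collation) for c in constants]


constants = [
    Const("string1", "utf8mb4_unicode_ci"),  # set by the BETWEEN rewrite
    Const("string2", "utf8mb4_unicode_ci"),  # set by the BETWEEN rewrite
    Const("string3", "utf8mb4_general_ci"),  # from collation_connection
]

aligned = align_constants("utf8mb4_unicode_ci", constants)
for c in aligned:
    print(c.value, c.collation)
```

Once every constant carries the column's collation, a comparison between any two of them is no longer ambiguous, and it uses the same ordering as comparisons against c1.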
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.