planner: address collation ambiguity in scalar function construction during predicate simplification. #57049

Merged
merged 4 commits into from
Nov 18, 2024

Conversation

dash12653
Contributor

@dash12653 dash12653 commented Nov 1, 2024

What problem does this PR solve?

Issue Number: close #56479

Problem Summary:

What changed and how does it work?

The SQL can be reduced to the following:

DROP TABLE IF EXISTS t1;

CREATE TABLE `t1` (
  `c1` VARCHAR(175) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'asMF'
);

SELECT * 
FROM t1 
WHERE c1 BETWEEN 'string1' AND 'string2' AND (c1 = 'string3' OR IsNull(c1));

When the BETWEEN ... AND clause is rewritten, the collations of 'string1' and 'string2' are set to c1's collation (utf8mb4_unicode_ci) rather than collation_connection.

During predicate simplification, a scalar function 'ge' is constructed with 'string1' and 'string3' as arguments. However, 'string1' has the collation utf8mb4_unicode_ci while 'string3' has utf8mb4_general_ci (collation_connection), and both have a coercibility of 4, so it is ambiguous which collation to use. As a result, constructing the new scalar function fails, and that failure leads to a panic.

There is an additional concern: if we replace BETWEEN ... AND with >= and <=, both 'string1' and 'string3' have the collation utf8mb4_general_ci (collation_connection). During predicate simplification they would then be compared using utf8mb4_general_ci instead of the column's collation, which might produce incorrect results.

A possible fix:

During predicate simplification, reset the collation of the constants ('string1', 'string2', 'string3') to the collation of column c1, which would address both problems.
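A rough sketch of that idea, using hypothetical types rather than TiDB's expression package: every constant compared against the column is given the column's collation before any constant-to-constant comparison is built. The names `constant` and `alignToColumn` are assumptions made for illustration only.

```go
package main

import "fmt"

// constant is a hypothetical stand-in for a string literal in a predicate.
type constant struct {
	value     string
	collation string
}

// alignToColumn returns copies of consts that carry the column's collation.
// The inputs are left unmodified.
func alignToColumn(columnCollation string, consts ...constant) []constant {
	out := make([]constant, len(consts))
	for i, c := range consts {
		c.collation = columnCollation
		out[i] = c
	}
	return out
}

func main() {
	consts := alignToColumn("utf8mb4_unicode_ci",
		constant{"string1", "utf8mb4_unicode_ci"},
		constant{"string2", "utf8mb4_unicode_ci"},
		constant{"string3", "utf8mb4_general_ci"},
	)
	for _, c := range consts {
		fmt.Println(c.value, c.collation)
	}
}
```

After alignment all three constants share one collation, so a comparison between any two of them is no longer ambiguous.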

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None


ti-chi-bot bot commented Nov 1, 2024

Hi @dash12653. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot bot added needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 1, 2024

tiprow bot commented Nov 1, 2024

Hi @dash12653. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@dash12653 dash12653 changed the title Address Collation Ambiguity in Scalar Function Construction During Predicate Simplification. address collation ambiguity in scalar function construction during predicate simplification. Nov 1, 2024
@dash12653 dash12653 changed the title address collation ambiguity in scalar function construction during predicate simplification. planner: address collation ambiguity in scalar function construction during predicate simplification. Nov 1, 2024
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. do-not-merge/needs-triage-completed labels Nov 1, 2024
if equalValueCollation != otherValueCollation {
	return false
}
equalValue.GetArgs()[1].GetType(evalCtx).SetCollate(equalValueCollation)

Contributor


why should we set it back? it has not changed?

Contributor Author


In the clause WHERE c1 BETWEEN 'string1' AND 'string2' AND (c1 = 'string3' OR IsNull(c1)), 'string1' and 'string2' are set to the column's collation (utf8mb4_unicode_ci) when BETWEEN ... AND is rewritten, but 'string3' keeps the connection-level collation (utf8mb4_general_ci). To avoid this mismatch, I explicitly reset the collations of the string constants before comparing 'string1' and 'string3' here.

Contributor

@AilinKid AilinKid Nov 15, 2024


We have deriveCollation(ctx, funcName, args, retType, retType) inside, which handles collation mismatch, but it is embedded in the cast function. So I guess the better way here is to build a wrapper cast function with BuildCastFunction as the new child at L237. I'm not sure, though; you can give it a try.

Contributor Author


Thanks for the clarification! I’ll check out deriveCollation and try building a wrapper cast function like BuildCastFunction as you suggested.

Contributor


Sorry, after discussing with @time-and-fate, it seems we could go with his suggestion, which is clearer.

@AilinKid
Contributor

AilinKid commented Nov 14, 2024

@dash12653 Hi, thanks for your contribution. Would you mind updating this pull request soon? We are launching a planner-related issue resolution campaign.

Member

@time-and-fate time-and-fate left a comment


Considering the problem in the case is actually an error from NewFunction but ignored, I would suggest changing the existing NewFunctionInternal to NewFunction and correctly handling the error.

@dash12653
Contributor Author

@dash12653 Hi, thanks for your contribution. Would you mind updating this pull request soon? We are launching a planner-related issue resolution campaign.

Thanks for letting me know! I’ll update the PR accordingly. Let me know if you need anything else.

@dash12653
Contributor Author

Considering the problem in the case is actually an error from NewFunction but ignored, I would suggest changing the existing NewFunctionInternal to NewFunction and correctly handling the error.

Thanks for the feedback! Just to clarify, are you suggesting that I should only update the function use and handle the error, and that my previous changes can be discarded?

@time-and-fate
Member

time-and-fate commented Nov 15, 2024

Considering the problem in the case is actually an error from NewFunction but ignored, I would suggest changing the existing NewFunctionInternal to NewFunction and correctly handling the error.

Thanks for the feedback! Just to clarify, are you suggesting that I should only update the function use and handle the error, and that my previous changes can be discarded?

Yes.
We could probably be smarter about handling collation mismatch, but this is not an appropriate place to add new logic for it.
If we focus on the bug itself, the direct cause is that the error inside NewFunctionInternal is ignored and the returned nil is left unhandled; it is then passed to the next step, causing the panic. Besides, NewFunctionInternal is also marked deprecated.
So, for this bug, I think it's better to switch to NewFunction and handle the error.
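The difference between the two call patterns can be illustrated with a small, self-contained Go sketch. The functions below are hypothetical stand-ins written for this example and do not reproduce the signatures of TiDB's NewFunction or NewFunctionInternal; they only show why a swallowed constructor error turns into a nil value downstream, and how checking the error lets the simplification step back off instead.

```go
package main

import (
	"errors"
	"fmt"
)

// expr is a hypothetical stand-in for a planner expression.
type expr interface{ String() string }

type scalarFunc struct {
	name      string
	collation string
}

func (f *scalarFunc) String() string { return f.name + " using " + f.collation }

// newFunction is a hypothetical constructor that fails on a collation mismatch.
func newFunction(name, leftCollation, rightCollation string) (expr, error) {
	if leftCollation != rightCollation {
		return nil, errors.New("illegal mix of collations")
	}
	return &scalarFunc{name: name, collation: leftCollation}, nil
}

// newFunctionIgnoringErr mimics the problematic pattern: the error is dropped,
// so callers receive nil and fail later when they use the result.
func newFunctionIgnoringErr(name, leftCollation, rightCollation string) expr {
	f, _ := newFunction(name, leftCollation, rightCollation)
	return f
}

// canMerge shows the handled pattern: on error the predicates are left as they are.
func canMerge(leftCollation, rightCollation string) bool {
	f, err := newFunction("ge", leftCollation, rightCollation)
	if err != nil {
		return false // skip this simplification instead of panicking
	}
	_ = f.String() // safe: f is non-nil here
	return true
}

func main() {
	f := newFunctionIgnoringErr("ge", "utf8mb4_unicode_ci", "utf8mb4_general_ci")
	fmt.Println("result is nil:", f == nil) // calling f.String() here would panic
	fmt.Println("merged:", canMerge("utf8mb4_unicode_ci", "utf8mb4_general_ci"))
}
```

In this sketch the handled path simply skips the simplification when the function cannot be built, which keeps the original predicates intact.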

@time-and-fate
Member

/ok-to-test

@ti-chi-bot ti-chi-bot bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Nov 15, 2024
@dash12653
Contributor Author

Considering the problem in the case is actually an error from NewFunction but ignored, I would suggest changing the existing NewFunctionInternal to NewFunction and correctly handling the error.

Thanks for the feedback! Just to clarify, are you suggesting that I should only update the function use and handle the error, and that my previous changes can be discarded?

Yes. We could probably be smarter about handling collation mismatch, but this is not an appropriate place to add new logic for it. If we focus on the bug itself, the direct cause is that the error inside NewFunctionInternal is ignored and the returned nil is left unhandled; it is then passed to the next step, causing the panic. Besides, NewFunctionInternal is also marked deprecated. So, for this bug, I think it's better to switch to NewFunction and handle the error.

Got it. I’ll switch to NewFunction and handle the error.


codecov bot commented Nov 15, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 3 lines in your changes missing coverage. Please review.

Project coverage is 74.7315%. Comparing base (ef8cac2) to head (10ab2c3).
Report is 15 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #57049        +/-   ##
================================================
+ Coverage   72.8597%   74.7315%   +1.8718%     
================================================
  Files          1672       1717        +45     
  Lines        462630     470817      +8187     
================================================
+ Hits         337071     351849     +14778     
+ Misses       104795      96851      -7944     
- Partials      20764      22117      +1353     
Flag Coverage Δ
integration 49.2614% <66.6666%> (?)
unit 72.2556% <33.3333%> (+0.0070%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.7673% <ø> (ø)
parser ∅ <ø> (∅)
br 60.6787% <ø> (+15.5530%) ⬆️

@dash12653
Contributor Author

/retest

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Nov 17, 2024

ti-chi-bot bot commented Nov 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AilinKid, time-and-fate

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Nov 18, 2024

ti-chi-bot bot commented Nov 18, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-11-17 14:08:36.649149589 +0000 UTC m=+797278.840018586: ☑️ agreed by time-and-fate.
  • 2024-11-18 03:59:00.674173863 +0000 UTC m=+847102.865042860: ☑️ agreed by AilinKid.

@fixdb
Contributor

fixdb commented Nov 18, 2024

/retest

@ti-chi-bot ti-chi-bot bot merged commit a9c5201 into pingcap:master Nov 18, 2024
24 checks passed
@dash12653 dash12653 deleted the tidb-56479 branch November 18, 2024 13:53
@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Nov 19, 2024
ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Nov 19, 2024
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #57476.

Labels
approved lgtm needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unexpected panic during PredicateSimplification
5 participants