planner: do not use like to build range when new collation is enabled #31278

Merged (21 commits) on Jan 21, 2022

Reminiscent (Contributor) commented Jan 4, 2022

What problem does this PR solve?

Issue Number: close #31174

Problem Summary:
The planner uses LIKE conditions to build index ranges, but the resulting range is not correct in every case (see the linked issue, where LIKE misbehaves for utf8_general_ci strings).

What is changed and how it works?

See here for more details. This PR only forbids using a LIKE condition to build a range when the new collation is enabled and the column is a string type with a non-binary collation.
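To make the rule above concrete, here is a minimal Go sketch of why a byte-ordered prefix range can disagree with LIKE under a non-binary collation. It is an illustration only, not TiDB's actual code: `prefixRange`, `inRange`, `likePrefixCI`, and `canBuildLikeRange` are hypothetical names, and plain case folding stands in for a real collation such as utf8_general_ci. The exact failure in the linked issue may involve other collation rules.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixRange returns the half-open range [low, high) that covers every
// string starting with prefix under byte-wise (binary) comparison.
// Assumption: prefix is non-empty and its last byte is not 0xFF.
func prefixRange(prefix string) (low, high string) {
	b := []byte(prefix)
	b[len(b)-1]++ // "a" -> "b", "...6f" -> "...6g"
	return prefix, string(b)
}

// inRange reports whether s lies in [low, high) under byte-wise comparison.
func inRange(s, low, high string) bool {
	return s >= low && s < high
}

// likePrefixCI emulates `s LIKE 'prefix%'` under a case-insensitive
// collation. Simplification: real collations have more rules than case folding.
func likePrefixCI(s, prefix string) bool {
	return strings.HasPrefix(strings.ToLower(s), strings.ToLower(prefix))
}

// canBuildLikeRange is a hypothetical guard mirroring the rule in this PR:
// skip range building when the new collation is enabled and the column
// uses a non-binary collation.
func canBuildLikeRange(newCollationEnabled, isBinaryCollation bool) bool {
	return !(newCollationEnabled && !isBinaryCollation)
}

func main() {
	low, high := prefixRange("a")
	for _, s := range []string{"abc", "ABC"} {
		fmt.Printf("%q: like=%v inRange=%v\n",
			s, likePrefixCI(s, "a"), inRange(s, low, high))
	}
	fmt.Println("build range:", canBuildLikeRange(true, false))
}
```

In this sketch "ABC" satisfies the case-insensitive LIKE but falls outside the byte range ["a", "b"), so a scan restricted to that range would miss it. Falling back to a full scan plus a LIKE filter trades speed for correctness.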

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Reminiscent added the type/bugfix and sig/planner labels Jan 4, 2022

ti-chi-bot (Member) commented Jan 4, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • tangenta
  • xiongjiwei

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot added the release-note-none and size/S labels Jan 4, 2022
ti-chi-bot added the size/M label and removed the size/S label Jan 5, 2022

bb7133 (Member) commented Jan 6, 2022

@Reminiscent Please update the test cases:

https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_check/detail/tidb_ghpr_check/43756/pipeline

[2022-01-05T08:16:30.338Z] [2022/01/05 16:16:29.075 +08:00] [FATAL] [main.go:748] ["run test"] [test=explain_generate_column_substitute]

error: sql:explain format = 'brief' select count(*) from tbl1 where md5(s) like '02e74f10e0327ad868d138f2b4fdd6f%';: run "explain format = 'brief' select count(*) from tbl1 where md5(s) like '02e74f10e0327ad868d138f2b4fdd6f%';" at line 182 err, we need:

explain format = 'brief' select count(*) from tbl1 where md5(s) like '02e74f10e0327ad868d138f2b4fdd6f%';
id                      estRows   task       access object                                 operator info
StreamAgg               1.00      root                                                     funcs:count(Column#6)->Column#4
└─IndexReader           1.00      root                                                     index:StreamAgg
  └─StreamAgg           1.00      cop[tikv]                                                funcs:count(1)->Column#6
    └─IndexRangeScan    250.00    cop[tikv]  table:tbl1, index:expression_index(md5(`s`))  range:["02e74f10e0327ad868d138f2b4fdd6f","02e74f10e0327ad868d138f2b4fdd6g"), keep order:false, stats:pseudo
select count(*) from tbl1 use index() where md5(s)

but got:

explain format = 'brief' select count(*) from tbl1 where md5(s) like '02e74f10e0327ad868d138f2b4fdd6f%';
id                      estRows   task       access object                                 operator info
StreamAgg               1.00      root                                                     funcs:count(Column#6)->Column#4
└─IndexReader           1.00      root                                                     index:StreamAgg
  └─StreamAgg           1.00      cop[tikv]                                                funcs:count(1)->Column#6
    └─Selection         8000.00   cop[tikv]                                                like(md5(cast(test.tbl1.s, var_string(20))), "02e74f10e0327ad868d138f2b4fdd6f%", 92)
      └─IndexFullScan   10000.00  cop[tikv]  table:tbl1, index:expression_index(md5(`s`))  keep order:false, stats:pseudo

errorVerbose repeats the error above, followed by this stack trace:

main.(*tester).execute
	/home/jenkins/agent/workspace/tidb_ghpr_check/go/src/github.com/pingcap/tidb/cmd/explaintest/main.go:391
main.(*tester).Run
	/home/jenkins/agent/workspace/tidb_ghpr_check/go/src/github.com/pingcap/tidb/cmd/explaintest/main.go:176
main.main
	/home/jenkins/agent/workspace/tidb_ghpr_check/go/src/github.com/pingcap/tidb/cmd/explaintest/main.go:747
runtime.main
	/usr/local/go/src/runtime/proc.go:225
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1371
sql:explain format = 'brief' select count(*) from tbl1 where md5(s) like '02e74f10e0327ad868d138f2b4fdd6f%';

sre-bot (Contributor) commented Jan 11, 2022

ti-chi-bot added the status/LGT1 label Jan 17, 2022

Reminiscent (Contributor, Author) commented

@tangenta Updated. PTAL

ti-chi-bot added the status/LGT2 label and removed the status/LGT1 label Jan 20, 2022

bb7133 (Member) commented Jan 20, 2022

/merge

ti-chi-bot (Member) commented

This pull request has been accepted and is ready to merge.

Commit hash: 6c9ae01

ti-chi-bot added the status/can-merge label Jan 20, 2022

bb7133 (Member) commented Jan 21, 2022

/run-unit-test

bb7133 (Member) commented Jan 21, 2022

/run-unit-test

winoros (Member) commented Apr 11, 2022

/run-cherry-picker release-5.2

Labels

  • release-note-none: Denotes a PR that doesn't merit a release note.
  • sig/planner: SIG: Planner
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • type/bugfix: This PR fixes a bug.
Development

Successfully merging this pull request may close these issues.

LIKE does not work as expected for utf8_general_ci strings
9 participants