-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
collations: fix sorting in UCA900 collations #12555
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! Signed-off-by: Vicent Marti <vmg@strn.cat>
vitess-bot
bot
added
NeedsDescriptionUpdate
The description is not clear or comprehensive enough, and needs work
NeedsWebsiteDocsUpdate
What it says
labels
Mar 6, 2023
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
If a new flag is being introduced:
If a workflow is added or modified:
Bug fixes
Non-trivial changes
New/Existing features
Backward compatibility
|
Signed-off-by: Vicent Marti <vmg@strn.cat>
dbussink
approved these changes
Mar 6, 2023
dbussink
added
Backport to: release-14.0
and removed
NeedsDescriptionUpdate
The description is not clear or comprehensive enough, and needs work
NeedsWebsiteDocsUpdate
What it says
Backport to: release-14.0
labels
Mar 6, 2023
systay
approved these changes
Mar 6, 2023
I was unable to backport this Pull Request to the following branches: |
vmg
added a commit
to planetscale/vitess
that referenced
this pull request
Mar 7, 2023
* collations: fix sorting in UCA900 collations When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: comment Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat>
4 tasks
vmg
added a commit
to planetscale/vitess
that referenced
this pull request
Mar 7, 2023
* collations: fix sorting in UCA900 collations When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: comment Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat>
4 tasks
frouioui
pushed a commit
that referenced
this pull request
Mar 8, 2023
…12556) * collations: fix sorting in UCA900 collations When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: comment Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat> Co-authored-by: Vicent Marti <vmg@strn.cat>
vmg
added a commit
that referenced
this pull request
Mar 8, 2023
* collations: fix sorting in UCA900 collations When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! * collations: comment --------- Signed-off-by: Vicent Marti <vmg@strn.cat>
vmg
added a commit
that referenced
this pull request
Mar 8, 2023
* collations: fix sorting in UCA900 collations When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! * collations: comment --------- Signed-off-by: Vicent Marti <vmg@strn.cat>
frouioui
added a commit
to planetscale/vitess
that referenced
this pull request
Nov 21, 2023
* Increase retries and timeout for expectation (vitessio#1628) This still flakes with the current setup, but the total wait time today is only a maximum of 50ms. With increasing the retries and the time between them, we now retry for up to 5 seconds. Anecdotally, Actions can see stalls up to like 2 seconds so we need a large retry time window here to improve test reliability. This also means we can remove a lot of other sleeps here since the expectation will retry anyway, improving the happy path test performance. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * boost: Implement logic to mimic global routing in Vitess for Boost (vitessio#1636) * boost: Implement logic to mimic global routing in Vitess for Boost In Vitess if there's no keyspace currently active, it's still possible to route a query to the right keyspace. We don't currently support this in Boost which means there's a mismatch between how it runs directly vs. when Boost is used. We need this for supporting planetscale/boost-internal#21 but also need to relax some constraints on Singularity as well as reverting the changes in planetscale/api-bb#6463 & planetscale/api-bb#6476. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * Improve error message for no keyspace Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * Handle routing cached queries with no keyspace When you install a query with no keyspace configured, we want to be able to match that whether you have a keyspace selected or not. This allows that to work by excluding the keyspace from the hash but matching it after the fact against the given keyspace, or allowing it if it's empty. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> --------- Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * boost: Retry leadership election if it fails (vitessio#1649) When we have an etcd blip right at the moment we also want to do the leadership election here, we don't correctly retry. The change here sets up a separate context we can cancel and then retry so we close the old watcher and spin up a new one in case this happens. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * Collations & Evalengine Backport (vitessio#1646) * sqlparser: Update tooling Signed-off-by: Vicent Marti <vmg@strn.cat> * BugFix: Escaping Percentage and Underscore require special handling (vitessio#11823) * test: add failing end to end test for escaped % and _ Signed-off-by: Manan Gupta <manan@planetscale.com> * feat: preserve escaping of \ and % in the parser Signed-off-by: Manan Gupta <manan@planetscale.com> * feat: add end to end test and planner test to verify that % and _ handling is correct in Vitess on vtgate level Signed-off-by: Manan Gupta <manan@planetscale.com> * feat: add test to evaluation engine to verify that backslash and underscore and handled correctly even when escaped Signed-off-by: Manan Gupta <manan@planetscale.com> * feat: remove unrequired changes Signed-off-by: Manan Gupta <manan@planetscale.com> Signed-off-by: Manan Gupta <manan@planetscale.com> Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: fix sorting in UCA900 collations (vitessio#12555) * collations: fix sorting in UCA900 collations When using the fast iterator to _compare_ two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation! Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: comment Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: Refactorings & fixes (vitessio#12554) * evalengine: add a compiler struct Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: refactor arithmetic ops Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: rename type check APIs Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/json: extract APIs Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/bit: add comment for behavior Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/call: make interface an Expr Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/json: extract conversion API Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/compare: fix multi-comparison behavior Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/convert: extract decimal precision API Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/bit: extract error Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/bytes: extract toNumericHex Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine/integration: move all tests into their own package Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: expose the charset package Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: remove unused file Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: add license headers Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: upgrade to MySQL 8.0.32 (vitessio#12557) * collations: update from MySQL 8.0.32 Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: use an array to store collations Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: remove Init from Collation interface Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: perform eager initialiation and simplify APIs Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: apply utf8mb3 <-> utf8 in both directions Signed-off-by: Vicent Marti <vmg@strn.cat> * collations: expose the Binary collation Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: More cleanup (vitessio#12573) * evalengine: fix multi-comparisons Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: add a limit for REPEAT Signed-off-by: Vicent Marti <vmg@strn.cat> * evalengine: Additional fixes for edge cases This solves edge cases around decimal promotion which follows a conversion to int64 (not uint64) and then is casted to uint64 for all bitwise operations. Also adds a bunch of tests for the hex() function and implemented conversions for non string like types. It also found more bugs in MySQL CASE handling, yay. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> --------- Signed-off-by: Vicent Marti <vmg@strn.cat> Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com> * boost: update collation APIs Signed-off-by: Vicent Marti <vmg@strn.cat> --------- Signed-off-by: Vicent Marti <vmg@strn.cat> Signed-off-by: Manan Gupta <manan@planetscale.com> Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com> Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com> Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * boost: Improve upquery generation (vitessio#1650) This fixes upquery generation for a union with group bys where there's a literal projection by ensuring we match the generation of the query to what we do without a join. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * [boost planner] Union on top of union (vitessio#1645) * Add tests for broken unions Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * attempt to allow building of unions on top of unions Signed-off-by: Andres Taylor <andres@planetscale.com> * use the right columns even for UNION ALL Signed-off-by: Andres Taylor <andres@planetscale.com> * fix derived tables and projections after TopK Signed-off-by: Andres Taylor <andres@planetscale.com> * use group by with count(*) instead of distinct Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr> * flownode: implement proper ordering trace for Readers Signed-off-by: Vicent Marti <vmg@strn.cat> * check one the right node Signed-off-by: Andres Taylor <andres@planetscale.com> * allow expressions from inside derived table to be handled correctly Signed-off-by: Andres Taylor <andres@planetscale.com> * boost: fix indeterministic union parents Signed-off-by: Vicent Marti <vmg@strn.cat> * Removed unused distinct flownode Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> --------- Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> Signed-off-by: Andres Taylor <andres@planetscale.com> Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr> Signed-off-by: Vicent Marti <vmg@strn.cat> Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com> Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr> Co-authored-by: Vicent Marti <vmg@strn.cat> Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> * Fix test expectation Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> --------- Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> Signed-off-by: Vicent Marti <vmg@strn.cat> Signed-off-by: Manan Gupta <manan@planetscale.com> Signed-off-by: Andres Taylor <andres@planetscale.com> Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr> Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com> Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com> Co-authored-by: Andres Taylor <andres@planetscale.com> Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr> Co-authored-by: Vicent Marti <vmg@strn.cat>
Closed
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
When using the fast iterator to compare two strings with an UCA collation, we need to keep in mind that the weights in the collation are in BIG ENDIAN (this is the output format for the weight strings, so we store the weights this way), so comparing them directly will not result in the proper collation order. They need to be byte-swapped before they can be compared with an arithmetic operation!
Related Issue(s)
Checklist
Deployment Notes